Preface


This volume attempts to present not only a definitive account of one
aspect of a vast project in vocational-test development, but also a useful
record of the experiences gained in the execution of that project. To the
extent that it succeeds, it will be of value not only to aviation
psychologists who carry on in the service of military or civilian
authorities, but also to vocational psychologists in general. While the tone
of the volume is pitched to the ear of the professional psychologist, an
attempt has been made to avoid the more technical jargon of the more
specialized statistically minded. By confining himself to the less technical
passages, the lay reader may find much that is illuminating and interesting
concerning tests and test methods.

Although  there  was  no  attempt,  in  the  program  to  be  described,  to 
follow  any  preconceived  ideal  procedure  of  test  development,  inherent  in 
this  account  is  an  emerging  pattern  of  research,  which,  utilizing  many  of 
the  techniques  of  the  past,  suggests  what  such  a  program  can  be  when 
liberal  support,  in  the  form  of  trained  personnel,  suitable  equipment,  and 
an  almost  unlimited  number  of  experimental  subjects,  is  provided. 

Well-known test theories, and past experiences in their application,
were brought to bear upon the problems of vocational selection and
classification in a rather special area, though it was an area of enormous
scope from a psychological standpoint. While the theoretical problem
and the empirical test of a procedure always had to be subordinated to the
fulfilment of a pressing practical goal, there is, nevertheless, many a
finding that transcends the immediate problem and its solution. The best
example of this was the utilization of factorial theory and methods.

Factorial analysis, brought into use somewhat incidentally at first,
became eventually the centralizing and guiding principle in connection
with most printed-test development. It must be admitted that the factorial
studies were neither as well planned nor as well executed as they would
have been in a program that had centered around them from the very
beginning. Only near the end of the four years' research did their full
benefits become apparent. Two ambitious intercorrelation studies, planned
in the early months of 1945, were not completed in time to be treated in
this report. The results of earlier analyses are given liberal mention,
however, and the description and evaluation of tests lean heavily, and it is
believed rather effectively, upon appeals to factorial information.

Rather unique to vocational-test development also is the inclusion of
analysis of job criteria by the factorial methods. It is believed that in
this direction lies an economical, systematic, and dependable procedure
for coverage of aptitudes and for fitting tests to vocations.

In the presentation of results, efforts have been made to facilitate
perusal of the chapters by the reader by means of a uniform type of
description of tests. This was not easy in view of the varied types of
tests, the nonuniformity of data available, and the multiple authorship.
Where efforts along this line have faltered, some of the monotony that
may arise from repetitious uniformity may be thus relieved. Fortunately,
there had been considerable uniformity and system in record keeping and
record treatment, testifying to wise supervision and to cooperation among
field units. Variations in procedure over a 4-year period and over a
number of research units at different field stations were inevitable. Most
regrettable of all are the few omissions of data which leave gaps that
were impossible to fill.

Failures are recounted as well as successes, but false starts that never
reached the stage of yielding results are best left unreported. Errors
undoubtedly still remain undetected in places, in spite of diligent efforts
to minimize their number and seriousness. Besides the editor and the
assistant to the editor, Capt. John I. Lacey, who have read all chapters
a number of times, Col. John C. Flanagan, Maj. Robert L. Thorndike,
Capt. Lloyd G. Humphreys, and Technical Sgt. Paul C. Davis have read
most of them. All have made valuable suggestions that have been
incorporated. None should be held accountable for errors that still remain.

The editor has exercised considerably more than the usual editorial
prerogatives, in that he has taken the liberty to suggest, and even to
make, omissions, modifications, and additions in places for the sake of
greater internal consistency and uniformity of treatment and for the sake
of more complete coverage of points that could be brought out. From
this point of view, the authors should not be held too strictly accountable
for all statements of theory or of interpretations that appear under their
names. While the editor is willing to assume responsibility for the
publication of statements of opinion, this does not necessarily mean that he
subscribes fully to all opinions offered.

This report and the work for which it stands are the product of many
minds and hands — many more, indeed, than those whose names appear
herein. Like other reports in this series, it represents a genuinely
cooperative program. The writers of the chapters that follow have been, in
general, substantial contributors to the execution of the program (though
not the only substantial contributors), as the numerous footnotes will
testify. Other footnotes will show that there were many other sources of
test ideas and test construction. Unnamed are the numerous persons,
civilian as well as military, who have added their contributions by
administering, scoring, recording, calculating, and other activities. By way
of exception, there will be mentioned here the names of some who cannot
be cited adequately in footnotes but who should receive mention for
special accomplishments. Two artists, Sgt. Fredrick H. Meise and Cpl.
James Ii. Ferguson, designed illustrations for test items as well as those
pictured in this report. Pfc. Leland D. Brokaw carried most of the
responsibility for assembling the statistics concerning tests. Mrs. Jeant.i*
E. Russell worked tirelessly on the preparation of the final manuscript
as well as in keeping organized files on tests. Maj. Merrill F. Roff played
an active role in the initial stages of much of the test-development
program — much more than references in footnotes would indicate.

Footnote citations of credit for test development are given, first, to
those who actually designed or wrote items; second, to those who
contributed new test ideas; third, to those who criticized tests with
significant consequences; and fourth, to those who supervised development in
a significant manner. In the citations, contributors are named in
alphabetical order, disregarding military rank and extent of contribution.
Many of those who were present during the gestation and birth of a test have
given their judgment as to the contributors who should be mentioned. In
spite of great efforts to be just, many inequities will still be apparent to
some. It is believed, however, that less injustice is done in terms of
unwarranted inclusion or exclusion from a list of contributors, than would
have been done in attempting to rank contributors for relative merits.




J. P. Guilford,
Colonel, Air Corps.


Beverly Hills, Calif., September 1946
CHAPTER ONE


Job  Requirements  of  Aircrew1 


INTRODUCTION 

Contents  of  the  Chapter 

The main purpose of this chapter is to describe very briefly the kind
of men required for each of the three air-crew assignments — bombardier,
navigator, and pilot. No space will be given to describing the duties of
air-crew members, since adequate descriptions are given in other volumes
of this series. It is sufficient here to give a synopsis of the information
upon which were based the many ideas for tests accounted for in this
volume.

The first section of the chapter will present a brief list of the sources
of information concerning the psychological requirements of air-crew
jobs. Three sections will give short descriptions of these requirements
and their relative importance for each air-crew job. A final section states
some very general considerations.

SOURCES  OF  INFORMATION 

An  examination  of  the  list  of  sources  of  information  concerning  air 
crew  reveals  that  many  different  approaches  were  made  to  job  analyses. 
For  a  more  adequate  account  of  them,  the  reader  is  referred  to  Report 
Nos.  3,  8,  9,  and  10.  It  is  recognized  that  most  of  these  procedures  have 
their  weaknesses,  but  since  we  are  concerned  here  only  with  positive 
values,  no  criticisms  will  be  offered. 

Types  of  Information 

The  various  types  of  information  and  their  sources  were  as  follows: 

Faculty board proceedings. — When a student is eliminated from pilot
training, his instructors and check pilots prepare a statement concerning
(1) the student's personal traits, emphasizing deficiencies, and (2) the
manner in which he flew his plane. Similar reports are also available in
connection with bombardier and navigator training.

Flying evaluation board reports. — If an air-crew man who has earned
his wings is found to be unsuited to tactical flying for any reason, his case
is submitted to a local evaluation board. When the board has reached a
decision, the report with recommendation is forwarded to a central board
at Headquarters, Army Air Forces. The man is then either kept on flying
status or is reclassified. In the report, statements regarding his
experience, present assignment, attitude towards his job, and apparent
proficiency in flying are included.

Observations of training. — From the first days of the AAF psychological
program, aviation psychologists attempted to learn all that was
possible about flying. Manuals, training memoranda, and textbooks were
studied by selected personnel. Missions were flown with students. Officers
and enlisted men were sent out under temporary duty orders to make
studies of specific air-crew assignments or of some phase of the
assignment. One example is an extended visit of an officer and several
enlisted men at a primary pilot school * to make a study of the task of
landing an airplane. Another visit was made to a bomber training school to
observe the activities of crew members in training for combat operations. At
later stages, rated officers — pilots, bombardiers, and navigators — were
assigned to various psychological units or projects for extended periods of
duty. Some of these had had some degree of professional psychological
training.

Formal job analyses. — These analyses consisted of setting up checklist
forms, similar to those used in industries, and making a fairly complete
survey of men and their jobs, with special emphasis upon the psychological
traits required. The reports of results included such topics as: general
duties of pilot and commander of crew (in the case of a pilot analysis);
nature of work, including location in airplane, posture, and working area;
equipment and tools, including delicate, as well as gross, manual controls;
computational aids, such as slide rules and tables; types of work required
(described under sequence of duties); movements required, duration of
work, and speed required; related vocations or avocations;
responsibilities; job satisfactions; description of worker as to experiences,
physical and mental abilities; and personal qualities, including interests
and attitudes.

Interviews with eliminated cadets. — Realizing that there were
weaknesses in the reports of faculty boards and flying-evaluation boards, an
attempt was made to understand the job of the pilot or, to be more
specific, of the student in pilot training by an interview approach.

Rating scales for aviation cadets. — Beginning early in 1942 a
proficiency rating scale designed by aviation psychologists was used in all
primary pilot schools. As contrasted to the faculty-board proceedings in
which instructors stated in their own terminology why a student was
eliminated, the rating scale carried a list of 20 traits, which thus provided
a report in standardized terminology.

Ratings by students concerning difficulties experienced in learning to
fly. — An interview rating scale containing 24 items was presented to
students in basic pilot training. They were asked to indicate on a checklist
the extent to which they had found each item difficult in (1) primary
training, and (2) basic training.

* The primary school provides the first stage of flying training for the
pilot. This stage is sometimes, but rarely, referred to as elementary pilot
training. Primary training is preceded by a pre-flight phase, which is
composed entirely of ground school courses, and is followed by the basic and
advanced flying school phases.

Grade-slip  entries. — As  a  routine  procedure,  each  pilot  instructor  made 
an  entry  on  a  grade  slip  indicating  any  difficulties  or  weaknesses 
that  the  student  with  low  grades  exhibited.  These  data  were  analyzed  and 
categorized  in  more  suitable  form  for  a  job-analysis  study. 

Clinical studies. — Several very ambitious clinical studies were made in
an attempt to reveal fundamental personal characteristics that are
important determiners of air-crew success. This entailed observing,
interviewing, examining, and measuring the performances of individuals in
training situations. In this connection, psychologists lived with students
at flying schools, taking flying training with them, messing with them, and
living in cadet quarters with them.

Anecdotal summaries. — In several instances, the anecdotal method was
used in preparing reports on job analysis. Collection of instances believed
to show good and poor judgment is one example of the use of the method.

Instructors' and supervisors' checklist data. — There have been several
variations of this approach. In one, flying instructors merely ranked 20
items according to their importance; in another, they checked the
important ones; and in the third, they rated each one according to a
numerical scale. Average ranks, frequency of mention, and average ratings
were used in the summaries.

In one extensive study in the Eighth, Ninth, Twelfth, and Fifteenth
Air Forces, supervisors of air-crew personnel were asked to indicate the
relative importance of each of 20 traits for individuals "capable of doing
superior work of a specific type in combat operations." These officers
indicated, on a 9-point rating scale, the minimum acceptable standards which
they believed should be met for each of these traits in selecting and
classifying air-crew personnel.
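
As a rough illustration of the three checklist summaries mentioned above
(average ranks, frequency of mention, and average ratings), the following
Python sketch computes each from a small set of responses. The trait names,
the responses, and the function names are all hypothetical, invented only for
illustration; they are not data or procedures taken from the program itself.

```python
from statistics import mean

# Hypothetical trait labels, invented for illustration only.
TRAITS = ["judgment", "coordination", "memory"]

# Variant 1 (hypothetical data): each instructor ranks the traits,
# with 1 meaning most important.
rankings = [
    {"judgment": 1, "coordination": 2, "memory": 3},
    {"judgment": 2, "coordination": 1, "memory": 3},
    {"judgment": 1, "coordination": 3, "memory": 2},
]

# Variant 2 (hypothetical data): each instructor checks the traits
# considered important.
checks = [
    {"judgment", "coordination"},
    {"judgment"},
    {"judgment", "memory"},
]

# Variant 3 (hypothetical data): each instructor rates every trait on a
# numerical scale (a 9-point scale is assumed here).
ratings = [
    {"judgment": 8, "coordination": 6, "memory": 5},
    {"judgment": 7, "coordination": 7, "memory": 4},
    {"judgment": 9, "coordination": 5, "memory": 6},
]


def average_ranks(responses, traits=TRAITS):
    """Mean rank per trait; a lower value means judged more important."""
    return {t: mean(r[t] for r in responses) for t in traits}


def frequency_of_mention(responses, traits=TRAITS):
    """Number of respondents who checked each trait."""
    return {t: sum(t in r for r in responses) for t in traits}


def average_ratings(responses, traits=TRAITS):
    """Mean numerical rating per trait."""
    return {t: mean(r[t] for r in responses) for t in traits}


if __name__ == "__main__":
    print(average_ranks(rankings))
    print(frequency_of_mention(checks))
    print(average_ratings(ratings))
```

The three summaries are on different scales (a low average rank is the
counterpart of a high frequency of mention or a high average rating), so they
are best compared by the order in which they place the traits rather than by
their raw values.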

ANALYSIS OF THE BOMBARDIER'S JOB

Psychological  Description 

Of the many psychological characteristics required of the bombardier,
perhaps the most important are the ability to attend to a variety of
detailed activities and the ability to remember the serial order of events.
The bombardier must be able to judge minimal rates of movement (rate and
drift) and must be able to synchronize these movements. This calls not
only for perceptual judgment, but precision of eye-hand coordination. He
must be able to work calmly under pressure of time, and he must,
therefore, be free from fear or nervousness. He must not be tense as he
coordinates the movements of the knobs in killing rate and drift. He must
be alert to his job, work rapidly, and make quick adaptations. He must
be able to identify the target and orient himself spatially. These are some
of the principal traits * demanded of the bombardier.

* In this report the term "trait" will be used in a very general sense to
include abilities (see ch. U).
Relative Importance of Various Categories

Tables 1.1 and 1.2 show respectively: (1) A list of factors causing
elimination from bombardier schools and percentages of times they were
mentioned in elimination reports on 102 eliminees at 1 school and (2)
ratings on a 9-point scale of 20 psychological categories, made by
supervisors of combat teams.

Comparison of the two analyses. — The lists of traits in tables 1.1 and
1.2 are so different in terminology that it is perhaps futile to attempt to
look for similarities and differences in degree. Were they more alike, one
might well look to them to supply some information regarding the
communality of requirements for training and for combat. Unfortunately,
this comparison is limited to some very general observations. In the
training data, quick, smooth, and accurate motor coordinations are stressed
as important; personality traits are moderately important; and perceptual
and intellectual traits — including arithmetic calculations — are near the
bottom of the list. In combat, on the other hand, intellectual and perceptual
abilities seem to rate higher, though arithmetic calculations still are
relatively low, and some personality traits increase in importance. On the
whole, there is little agreement between the two lists. Whether there
would be a closer agreement between two independent groups of judges
either in training or in combat activities is unknown.4


Table 1.1. — Frequencies of reasons for elimination from bombardier training
as found in 102 elimination board reports in one school

Category                                               Code 1   Percent times
                                                                mentioned

Ability to execute a series of activities accurately
  and in proper order                                    C          79
Ability to learn                                         I          57
Eye-hand coordination                                    C          55
Ability to work rapidly                                  A          47
Ability to make fine and smooth manual movements         C          41
Nervousness and tenseness                                P          37
Ability to adapt to unusual circumstances                P          25
Self-confidence                                          P          22
Judgment                                                 P          11
Self-analysis                                            I          11
Interest and motivation                                  P          11
Ability to perform arithmetic computations accurately
  and rapidly                                            I          19
Orientation                                              A      [illegible]
[illegible]                                              A           4

1 The 16 items were grouped into 4 main categories, each with a code letter
as follows: Intelligence and judgment (I), alertness and observation (A),
coordination and technique (C), and personality and temperament (P).


Table 1.2. — Average ratings of importance of psychological categories for
combat bombardiers 1

Category (in the order printed; the mean ratings are illegible in the source)

Orientation and observation
Emotional control
Speed of decision and action
Judgment
Finger dexterity
Memory
Dial and table reading
Division of attention
Serial reaction time
Dependability
Coordination
Motivation
Leadership
Arithmetic calculations
Estimation of speed and distance
Reading comprehension
Visualization of the flight course
Mechanical comprehension
Mathematics
Arithmetic reasoning

1 Raters were 41 squadron and group bombardiers. The ratings were made on a
9-point scale under the instruction "circle the number indicating the minimum
standard which you believed should be required." Definitions of scale numbers
were roughly as follows: 9 — exceptional; 7 — very much better than average;
5 — better than average; 3 — average enlisted man; 1 — worse than average.


ANALYSIS  OF  THE  NAVIGATOR’S  JOB 


Psychological  Description 

It has been said that the navigator is the most intellectual of the air
crew, that he is pedagogically inclined, and academically motivated. Every
analysis of his duties has emphasized the high degree of mentality
required by this position. Whether or not the good navigator need be a more
intelligent person than the good pilot or good bombardier can be
questioned. The matter cannot be settled without defining "intellectual" and
"intelligent" in some demonstrable terms.

There is no doubt that in certain abilities the navigator must excel. The
very nature of his work demands that he be interested in and have some
knowledge of mathematics, though this need not include "higher"
mathematics. It is certain also that he must readily understand abstract
concepts. As we examine reports of eliminated cadets and job descriptions
prepared by instructors in navigation, we are impressed with the large
number of other traits needed by the navigator. These include such traits
as the ability to work rapidly, accurately, and neatly. With respect to the
last-mentioned trait, it is a fact that a number of students have been
eliminated because they were either poor draftsmen or could not write
legibly enough to read their own figures while making computations in the
air. In other instances serious errors have been made in the navigator's
log books for a dozen or more reasons, not the least of which were errors
in simple addition and subtraction.

The navigator must also be thorough in his work and able to analyze
and to correct his own errors. He must exercise good judgment and show
the ability to concentrate effectively on navigational problems over long
periods of time. Some individuals have been eliminated because they
failed to precheck their instruments and others because they failed to
report defective instruments upon landing. While manual skills are perhaps
not so important as intellectual abilities, we do find that some navigators
have difficulty in manipulating such instruments as the drift meter and
the pelorus or astrocompass.

The navigator's confidence in his work must be a balanced mental trait.
This means that he must have neither too much nor too little confidence.
He must not be so overconfident that he takes only one reading with the
satisfaction that it is correct. He has been taught that frequent readings
minimize the probability, as well as the size, of errors that can be made.
He must not be so lacking in confidence that he takes an excessive number
of readings. By so doing, he passes on his apparent lack of confidence to
the rest of the crew members who may consequently suffer lowered
morale. The navigator must display coolness and deliberation. This is
especially true in combat, when, after the bombing run, the navigator
must keep oriented in the midst of battle.

4 A study of the bombardier is found in No. 9 of this series.

Another important characteristic of the good navigator is foresight and
planning. In one combat mission, a forced landing was necessary. When
the pilot asked his navigator for suggestions as to an emergency landing
field, he was immediately told that a few miles further on there was a
beach on which the navigator thought it might be possible to make a crash
landing. Later it was learned that it was the practice of this navigator to
note all level fields and beaches that looked favorable for a crash landing
and to mark them on his map along the course of flight for future
reference.

Navigators are constantly impressed with the necessity of being
familiar with several forms of navigation. If the navigator is flying CAVU
most of the time, he may neglect to keep up on dead-reckoning procedures.
Finally, the navigator must be a leader of men, because he is usually
considered third in command of the ship.

Relative Importance of Navigator Qualities

Table 1.3 shows some of the items commonly checked by instructors as
causes for elimination from navigation training, while table 1.4 presents
combat data comparable to those previously given in table 1.2 for the
bombardier.

Comparison of the two analyses. — In training, arithmetic computation
— in terms of both speed and accuracy — ranks very high. Judgment,
visualization, reasoning, and ability to learn abstract concepts are also
regarded as very important. Among temperamental traits, neatness and
orderliness are deemed significant. Other personality traits are of
moderate or low importance, and motor coordination ability is not mentioned
at all.

In combat, certain temperamental traits come up to the head of the list,
equaling or excelling intellectual traits, such as judgment and arithmetical
computations which are still prized. Perceptual qualities are of moderate
or low significance in training, but a perceptual trait — orientation and
observation — heads the list for combat performance. Motor coordination
is at the bottom of the list as judged by combat supervisors, in good
agreement with opinion of instructors in training schools.*

* A much fuller account of the navigator will be found in report No. 10 of
this series.

Table 1.3. — Percentage of times traits were checked by 112 instructors as
cause of elimination from navigation school

Category                                                          Percentage

Inability to correct own errors                                       76
Errors in simple arithmetic computations                              75
Slowness in learning new concepts                                     74
Poor judgment                                                         73
Slowness in simple arithmetic computations                            72
Incapable of adequate visualization to perform celestial work         68
Inability to meet and adjust effectively to new situations
  (especially in the air)                                             68
Lack of analytical mind (regardless of mathematical training)         59
Lack of orderliness in work procedures or log book                    56
Lack of confidence                                                    49
Inability to concentrate effectively over prolonged periods
  of time (examinations and flights)                                  48
Nervousness in examinations                                           41
Lack of initiative                                                    41
Nervousness in flights                                                39
Lack of neatness in chart work and log book                           37
Inability to use computer                                             33
Inadequate mathematical background                                    30
Lack of necessary emotional stability                                 30
Inability to read drift                                               26
Inability to use tables or graphs                                     23
Lack of interest                                                      23
Inability to shoot with sextant 1                                     22
Inadequate general educational background                             19
Inability to learn necessary technical terms                          17
Inability to read or use instruments                                  14
Airsickness (as a contributing factor)                                 0
Inability or unwillingness to accept new concepts or techniques        7
Dislike of flying                                                      4
Fear of flying                                                         2

1 Based on elimination in celestial navigation only.


ANALYSIS OF THE PILOT'S JOB

Psychological Description

In general, the pilot must be a person who thinks and acts in a quick
and positive manner. This is perhaps more true of the fighter pilot than
of the bomber pilot, who can at times be more deliberate in his thinking.
A similar difference exists between fighter and bomber pilots with regard
to speed of action. The latter's actions should be highly characterized by
reliability and dependability.

Both types of pilot should show good judgment, although that of the
bomber pilot is expected to be more mature. It is important for both men
to remember procedures. In the case of the bomber pilot there are a few
more things to do, and the order in which they are done is of great
importance. The fighter pilot must be far more alert to what is going on
around him than the bomber pilot, because the latter can depend upon
his many crew members to inform him of the presence and activity of
enemy airplanes.


Table 1.4. — Average ratings of importance of psychological categories for
combat navigators 1

Category                                   Mean rating

Orientation and observation                    7.8
Emotional control                              7.3
Dependability                                  7.2
Judgment                                       7.1
Speed of decision and action                   7.1
Reading comprehension                          7.0
Arithmetic calculations                        6.9
Memory                                         6.8
Division of attention                          6.8
Dial and table reading                         6.6
Estimation of speed and distance               6.6
Leadership                                     6.6
Motivation                                     6.5
Visualization of the flight course             6.4
Arithmetic reasoning                           5.9
Serial reaction time                           5.9
Mathematics                                    5.5
Mechanical comprehension                       5.[illegible]
Finger dexterity                               5.0
Coordination                                   4.8

1 See footnote to table 1.2. The raters were 77 squadron and group
navigators.

Differences between the two types of pilots are more apparent in
temperamental traits than in abilities. The good fighter pilot should be an
aggressive individual but, in that aggressiveness, should not lose control
of his emotions. A trait common to both is the ability to work in a team.
The bomber pilot must inspire his crew, give them a feeling of confidence
in him and in his decisions, and develop in them a spirit of cooperation.
He  is  expected  to  develop  a  comradeship  with  his  crew  without  permitting 
the  element  of  familiarity  to  destroy  his  discipline.  The  fighter  pilot  does 
not  always  function  as  a  "lone  eagle"  in  his  combat  operations.  He  must 
frequently  cooperate  with  others. 

In  addition,  the  average  pilot  must  have,  at  least  to  a  moderate  degree, 
abilities  ascribed  to  the  navigator.  He  must  possess  ability  to  orient  him¬ 
self  quickly  and  to  match  geographical  landmarks  with  their  representa¬ 
tions  on  a  map.  Some  pilots,  particularly  fighter  pilots,  must  also  possess 
characteristics of a good gunner, since they may be flying pursuit ships
and  engaging  in  either  air-to-air  firing  or  in  strafing  activities. 

Relative Importance of Traits of Pilots

Tables 1.5 through 1.7 show the relative importance of various
psychological categories as based upon elimination records, statements of
eliminated cadets, statements concerning reclassified pilots, and judgments
of supervisors of combat teams. Since most of the job-analysis work has
been done upon the pilot, there are many such tables available, but those
presented here will suffice to indicate some of the more important findings
and some of the weaknesses in the job analyses which have been done. In
table 1.5, five studies have been summarized. This summarization was
possible because the same categories and method of evaluation had been
used. The categories presented here have become somewhat standard in
the AAF Aviation Psychology program.

Table  1.6  presents  evidence  concerning  combat  requirements.  Super¬ 
visors  of  combat  teams  and  others  were  asked  to  indicate  on  a  9-point 
rating  scale  the  extent  to  which  each  psychological  factor  is  important  to 
a pilot. From these data can be obtained some conception of what the
supervisors think are the important traits for fighter and bomber pilots.

Table  1.7  shows  the  percentage  of  times  various  reasons  were  men¬ 
tioned  by  150  eliminated  pilots  as  reasons  for  elimination. 

Comparison  of  different  analyses. — An  examination  of  tables  1.5 
through 1.7 will show that results concerning traits regarded as important
for the pilot depend upon a number of factors: (1) Whether training or
combat  is  the  test  of  proficiency;  (2)  stage  of  training;  (3)  type  of  air¬ 
plane;  and  (4)  whether  judgment  is  made  by  boards,  instructors,  or  by 
students. 

In primary training, the leading traits as indicated in elimination-board
proceedings are judgment, coordination, progress in developing skills,
foresight and planning, visualization of flight course, estimation of speed
and distance, and division of attention. In advanced training, eliminations
are most frequently said to occur in conjunction with deficiencies in
judgment, coordination, memory, visualization, and progress in developing
skills. This list differs chiefly from that for primary training in the
addition of memory and the loss of foresight and planning. There are some
differences between single-engine and twin-engine training, but they are
of uncertain significance. In operational training, reclassified pilots most
frequently show these characteristics: fear and apprehension, lack of
progress, lack of motivation, and lack of judgment. The chief new
feature, then, and it heads the list, is fear and apprehension.

Table 1.5.—Percentage of times categories were mentioned as a cause of
elimination or reclassification in pilot training¹

Columns, left to right: elementary eliminations (N = 1,000 and N = 1,000);
advanced eliminations (single engine, N = 100, and twin engine, N = 100);
operational² reclassifications (N = 100 and N = 100).

Categories                                Percentages

A. Intelligence and judgment³ ........... 61  [?]  [?]  22
   Judgment ............................. [?]  [?]  52  41  14  47
   Foresight and planning ............... [?]  25  12  [?]  17  [?]
   Memory ............................... [?]  29  52  21  [?]  5
   Comprehension ........................ 17  15  25  27  7
B. Alertness and observation ............ 70  [?]  77  7
   Visualization of flight course ....... 36  50  46  41  [?]  9
   Estimation of speed and distance ..... [?]  51  27  58  [?]  1
   Sense of sustentation ................ [?]  54  25  7  5  9
   Division of attention ................ [?]  41  45  14  5  2  9
   Speed of decision and reaction ....... [?]  59  55  40  7  2
C. Coordination and techniques .......... [?]  91  69  4
   Coordination ......................... 51  56  74  57  9  9
   Appropriateness of controls used ..... 21  [?]  10  15  0  9
   Feel of controls ..................... 2  57  25  5  0  9
   Smoothness of control movement ....... 22  25  50  25  0  1
   Progress in developing technique ..... 54  42  52  54  47  41
D. Personality and temperament .......... [?]  [?]  59  [?]  91
   Absence of tenseness ................. 22  29  29  50  1  12
   [?] .................................. 12  26  [?]  [?]  7  [?]
   Absence of fear and apprehension ..... [?]  17  7  [?]  [?]  99
   Suitable temperament ................. 9  [?]  [?]  [?]  [?]  21
   Motivation and attitudes ............. [?]  [?]  19  20  52

¹ Percentages do not total 100, since more than one item is given for each
elimination.
² A very small percentage of these were actually in combat.
³ Percentages in italics refer to relative frequencies with which groups of
traits were mentioned.


Table 1.6.—Average ratings of importance of psychological categories for
combat pilot positions¹

                                          Ratings by supervisors
                                          of combat teams

Categories                                Fighter pilot   Bomber pilot

Speed of decisions and reaction ......... 8.0             7.2
Judgment ................................ 7.7             7.3
Motivation .............................. 7.7             6.4
Emotional control ....................... 7.6             7.3
Estimation of speed and distances ....... 7.5             6.1
Division of attention ................... 7.5             6.8
Leadership .............................. 7.4             5.9
Dependability ........................... 7.2             6.5
Orientation and observation ............. 7.2             5.[?]
Visualization of the flight course ...... 6.7             6.4
Memory .................................. 6.6             6.4
Coordination ............................ 6.1             6.0
Mechanical comprehension ................ 6.0             6.0
Serial reaction time .................... 5.9             5.9
Reading comprehension ................... 5.6             5.7
Arithmetic reasoning .................... 4.8             4.7
Dial and table reading .................. 4.8             5.6
Finger dexterity ........................ 4.2             5.0
Arithmetic calculations ................. 4.1             4.5
Mathematics ............................. 3.3             3.9

¹ Raters of fighter-pilot requirements were 30 squadron commanders and
squadron operations officers in the European theatre of operations. Raters
of bomber-pilot requirements were 117 similar officials.


Table 1.7.—Percentage of times categories were mentioned by 150 eliminated
cadets as cause of elimination (pilot)

Categories / Percentage of times mentioned (two column pairs, side by side)

Nervousness in the air ....

54 

Understanding  of  plane's 

53 

16 

IS 

Judgment  of  height-speed  in 

Judgment  . . . . 

30 

13 

13 

Lack of "feel of the ship" . .

27 

Erratic  performance . 

Inaipfoi'f *4tf  altitii'lfft  ..... 

26 

Flight  planning  and  pattern  . 

13 

Poor control of the ship in

23 

20 

Mechanical  flying . 

9 

l.tMilins  . . 

Inadequate  correction  for 

9 

19 

9 

Stick and rudder control ...

17 

In combat, traits rated among the most important for both fighter and
bomber pilot are: judgment, motivation, speed of decision and reaction,
emotional control, and division of attention. Speed of decision and
reaction is apparently much more crucial in combat than in training, as one
might expect. It is interesting that whereas estimation of speed and
distance is given high place for the fighter pilot, dependability is regarded
as more important for the bomber pilot.

Of all traits, judgment stands out as being most persistent and
universal. This is not the place to try to define judgment or to break it down
psychologically. In the minds of aviation observers it undoubtedly means
a great variety of things. At best, it signified good or bad decisions
(where "good" and "bad" mean that the result turned out well or did not
turn out well, or that the decision was or was not what the observer would
have done under similar circumstances). However this may be, the
frequent mention of judgment for the pilot, and for other air-crew
personnel as well, was a persistent challenge to break it down to manageable
components and to devise tests for it.*

CONCLUDING  STATEMENT 

During  the  early  months  of  the  war,  at  least,  job-analysis  information 
from  all  known  sources  was  eagerly  grasped  and  exploited  for  what  it 
seemed  to  be  worth,  in  accordance  with  the  desperateness  of  the  situation. 

It  was  recognized  that  much  better  knowledge  was  needed  and  would 
probably  be  forthcoming  during  the  later  course  of  events.  From  the  early 
days,  when  even  anecdotal  material  was  tolerated,  and  informal  observa¬ 
tions  served  as  a  basis  for  test  ideas,  the  progress  in  job  analysis  was 
marked  by  a  transition  through  statistical  studies  of  quasi-standardized 
observations,  until  at  later  times  factor-analysis  methods  were  invoked 
to  study  job  criteria  as  well  as  tests.  Since  the  latter  type  of  results  can 
be  discussed  only  in  connection  with  tests,  and  these  need  to  be  described, 
an account of such results will be reserved for later pages (see ch. 28).

It will be seen during the course of succeeding chapters how well, and at
times how poorly, observations of jobs yielded useful concepts and led
to tests which did or did not measure significant aspects of air-crew
aptitude.

* For a fuller account of the pilot, see Report No. 8 of this series.

CHAPTER TWO

The Program of Printed Test Development¹


JOB  ANALYSIS  IN  RELATION  TO  THE  CONSTRUCTION 
OF  PRINTED  TESTS 

The  previous  chapter  discussed  various  sources  of  information  about 
pilots,  navigators,  and  bombardiers  that  were  available  to  guide  test  con¬ 
struction.  This  chapter,  which  discusses  the  printed-test  research  pro¬ 
gram,  starts  with  the  relationship  of  job-analysis  findings  to  test  construc¬ 
tion.  For  this  purpose,  it  is  convenient  to  distinguish  two  levels  of  job- 
analysis  information. 

Levels  of  Job-Analysis  Information 

Practically  all  job  descriptions  can  be  placed  in  two  categories.  Some 
do  not  go  beyond  a  description  of  what  the  worker  does.  Descriptions  of 
this  sort  might  legitimately  be  termed  “phenotypic”  descriptions.  They 
are  most  likely  to  lead  to  job-sample  tests.  In  thinking  of  the  job  of  the 
pilot,  for  example,  some  task  involving  a  stick  and  rudder  bar  is  immedi¬ 
ately  suggested.  Other  job  descriptions  attempt  to  describe  the  abilities 
used by the worker in his job. Such descriptions are more taxing
psychologically; i.e., they are at a more profound level. They might,
therefore, be termed "genotypic" descriptions. They are likely to lead to tests
of functions or factors.

Phenotypic  descriptions  and  work-sample  tests. — It  is  a  psychological 
truism that maximum validity for a single test for any criterion can
usually be obtained by means of a work-sample test. The reasons for this are
not hard to find. The work-sample test, insofar as it is a true sample of
the  job,  will  contain  the  valid  factors  in  proportion  to  their  proper 
weighting  and  will  be  on  the  average  about  as  reliable  as  the  criterion.  It 
seems  obvious  that  this  procedure  will  be  most  successful  for  relatively 
simple  criteria. 

If  the  job  is  very  complex,  on  the  other  hand,  phenotypic  job  descrip¬ 
tions  lead  to  tests  sampling  segments  of  the  job.  If  table  reading  is  in¬ 
volved,  a  table-reading  test  is  constructed;  map-reading  activity  suggests 
a  test  of  map  reading,  etc.  When  such  tests  have  been  constructed,  how¬ 
ever,  the  usual  finding  is  that  their  correlations  with  each  other  are  high, 
so  that  the  multiple  correlation  derived  by  combining  several  such  tests 
will  be  little  higher  than  the  single  highest  validity  coefficient  in  the  group. 
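The finding just described can be illustrated with a minimal sketch. It assumes, purely for illustration, that all segment tests have the same validity and that every pair of tests has the same intercorrelation; under that assumption the squared multiple correlation is k v^2 / (1 + (k - 1) r). The function name and the figures 0.40, 0.80, and 0.20 are hypothetical and are not taken from the report.

```python
import math


def multiple_r_equicorrelated(validity: float, intercorrelation: float, k: int) -> float:
    """Multiple correlation of k tests with a criterion.

    Assumes every test has the same validity and every pair of tests has
    the same intercorrelation (an idealized, hypothetical battery).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    r_squared = k * validity ** 2 / (1 + (k - 1) * intercorrelation)
    return math.sqrt(r_squared)


# Hypothetical battery: four job-segment tests, each with validity 0.40.
high_overlap = multiple_r_equicorrelated(0.40, 0.80, 4)  # tests correlate 0.80
low_overlap = multiple_r_equicorrelated(0.40, 0.20, 4)   # tests correlate 0.20

print(f"intercorrelation 0.80: R = {high_overlap:.3f}")
print(f"intercorrelation 0.20: R = {low_overlap:.3f}")
```

With the high intercorrelation the combined validity barely exceeds that of a single test, whereas the same four tests with low intercorrelations would combine to a noticeably higher multiple correlation.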

¹ Written by Capt. Lloyd G. Humphreys.

Work-sample tests, in addition, are not widely useful since they are
"tailored" for a particular criterion. While these tests have not been
overlooked completely, it is certainly true that they have not constituted a
major emphasis in the test research reported in this volume.

Genotypic  descriptions  and  tests  of  functions. — The  use  of  genotypic 
job  descriptions  has  been  limited  by  lack  of  knowledge  concerning  human 
traits  and  their  measurement.  Once  these  traits  have  been  defined — and 
the  factor-analysis  technique  gives  promise  of  greatly  facilitating  this 
step — tests  can  be  constructed  to  measure  the  separate  functions.  Al¬ 
though  considerable  progress  had  been  made  in  this  direction,  chiefly  due 
to the work of Thurstone, a satisfactory battery of tests of independent
functions  or  factors  was  not  in  existence  at  the  outset  of  printed-test 
construction  in  the  Army  Air  Forces  Aviation  Psychology  Program. 

The advantages accruing through the use of tests of independent
functions are substantial, particularly in a classification battery where a test
may be weighted for more than one specialty. Such tests are also more
flexible  if  criteria  change.  From  the  first,  therefore,  test  research  was 
oriented  toward  tests  of  important  functions.  Certain  functions  were 
deemed  to  be  important  in  early  job  analyses.  As  validation  studies  of 
classification  and  experimental  tests  became  available,  the  list  of  im¬ 
portant  functions  was  considerably  modified  and  somewhat  enlarged. 

Available  Job  Information 

For  reasons  discussed  in  the  following  section,  the  problem  of  selecting 
and  classifying  the  pilot  more  or  less  dominated  the  research  program 
from  the  first.  Concerning  the  pilot,  the  most  important  source  of  job  in¬ 
formation  available  at  the  beginning  of  research  with  printed  tests  was 
the  analysis  of  faculty  board  proceedings  discussed  in  chapter  1.  Com¬ 
ments  made  by  flying  instructors  concerning  reasons  for  elimination  of 
1,000  students  in  elementary  flying  training  constituted  the  basic  data. 
Psychological  analysis  of  these  comments  produced  a  list  of  20  traits  that 
were presumably important in pilot success. No matter how keen the
analyst, any analysis of comments made by psychologically untrained
observers would be deficient, because the basic data are not completely
sound. Although this was realized from the outset, this list of 20 traits
constituted  almost  all  the  information  available  concerning  the  abilities 
necessary  in  learning  to  fly  “the  Army  way.”  It  should  be  noted  that  this 
list  oriented  the  research  program  from  the  beginning  towards  tests  of 
functions  or  factors. 

THE PLAN OF TEST DEVELOPMENT
Importance of the Analysis of Faculty Board Proceedings

Although faculty-board proceedings had been studied only for pilots,
the organization of the research program, as well as the planning for
printed-test research, was based on the analysis of those proceedings. This
was the result of several circumstances. In the first place, the original
responsibility of the aviation psychology program was for research on pilot
selection; responsibility for bombardiers and navigators was assumed
somewhat later. In the second place, pilot quotas were initially so large in
comparison to those for navigators and bombardiers that the classification
problem was largely a pilot-selection problem. In addition, a satisfactory
degree of validity was obtained very early for the navigator aggregate
aptitude score, while the available bombardier criterion had so little
reliability that research concerning bombardier aptitude was almost hopeless.

Organization of the research program.—The list of 20 traits derived
from the study of elimination board proceedings was divided into four
main categories: intellectual, perceptual, temperamental, and
psychomotor. Responsibility for test research was originally delegated as follows:
Psychological Research Unit No. 1, temperament tests; Psychological
Research Unit No. 2 and the Department of Psychology of the School of
Aviation Medicine, psychomotor tests; Psychological Research Unit No.
3, intellectual and achievement tests; and the Psychological Section,
Headquarters, AAF Training Command, perceptual tests. While the
responsibility for test development in these areas was later modified in several
ways, the separation of tests into these categories continued to be a factor
in test development until the end of the program. It should be noted that,
since the concern of the present volume is with printed tests, only three
of the four categories will be discussed. Psychomotor tests constitute the
group of apparatus tests discussed in Report No. 4 of this series.

The test coding system.—The coding system established for the
test-research program was based upon the same four categories. The 20
hypothesized traits of unsuccessful pilots made up most of the subcategories
used in the system.* The basic code number for a test begins with two
letters followed by three digits and then another letter. All classification
tests, or tests designed for classification purposes, have code numbers
beginning with the letter "C." The second letter indicates one of the four
main categories: I, Intellectual; P, Perceptual; E, Temperamental;
and M, Psychomotor. The first digit indicates the subarea within the
main area. The next two digits indicate different tests within the subarea.
The following letter indicates different revised forms of the same test.
This basic code number is followed, in the case of tests in other than
final form, by the letter "X." Successive experimental versions of the
same form, therefore, are indicated as X1, X2, etc. Thus, the code
number CI206C (Arithmetic Reasoning) means that the test was designed for
classification purposes, in the intellectual area, reasoning subgroup, and
that it was the third form of the sixth reasoning test to be given a code
number.
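The coding rules above are regular enough to be expressed as a small parser. The following is only an illustrative sketch in Python: the names `ParsedCode` and `parse_test_code` are hypothetical, the pattern accepts any capital letter in the first position because only "C" is explained in the text, and subarea digits are returned as numbers because the text names only one subarea (reasoning, in the example CI206C).

```python
import re
from dataclasses import dataclass
from typing import Optional

# Second-letter categories as listed in the text.
CATEGORIES = {
    "I": "Intellectual",
    "P": "Perceptual",
    "E": "Temperamental",
    "M": "Psychomotor",
}

# Two letters, three digits, a form letter, then an optional "X" with an
# optional experimental-version number (X, X1, X2, ...).
CODE_PATTERN = re.compile(r"^([A-Z])([IPEM])(\d)(\d{2})([A-Z])(?:X(\d*))?$")


@dataclass
class ParsedCode:
    purpose_letter: str                  # "C" marks a classification test
    category: str                        # one of the four main categories
    subarea: int                         # first digit
    test_number: int                     # next two digits
    form: str                            # letter for the revised form
    form_ordinal: int                    # A = 1, B = 2, C = 3, ...
    experimental: bool                   # True when the code ends in X...
    experimental_version: Optional[int]  # 1 for X1, 2 for X2; None if absent


def parse_test_code(code: str) -> ParsedCode:
    """Split a code number such as 'CI206C' or 'CI206CX2' into its parts."""
    match = CODE_PATTERN.match(code.strip().upper())
    if match is None:
        raise ValueError(f"not a recognizable test code: {code!r}")
    purpose, category, subarea, number, form, version = match.groups()
    return ParsedCode(
        purpose_letter=purpose,
        category=CATEGORIES[category],
        subarea=int(subarea),
        test_number=int(number),
        form=form,
        form_ordinal=ord(form) - ord("A") + 1,
        experimental=version is not None,
        experimental_version=int(version) if version else None,
    )


example = parse_test_code("CI206C")
print(example)
```

Applied to CI206C, the sketch reports the intellectual area, subarea 2, test number 6, and the third form, which agrees with the reading of that code given in the text.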

Plan  of  research  — The  original  plan  of  research  was  to  develop  one 
or more tests in each of the subcategories of the coding system for
validation and possible inclusion in the classification battery. This procedure
was  not,  of  course,  deemed  to  be  a  permanent  solution  to  the  pilot-selec¬ 
tion  problem.  It  did  promise  to  give  initial  coverage  of  a  number  of  poten¬ 
tially  valid  factors.  It  was  expected  that  validation  findings  and  addi¬ 
tional  job  analyses  of  various  types  would  serve  as  the  primary  guide  in 
later  research. 


Importance  of  Validation  Studies 

The  importance  of  rapid  validation  of  tests  cannot  be  over-emphasized, 
either  in  the  research  program  in  the  Army  Air  Forces  or  in  any  selection 
program.  The  usefulness  of  any  job  analysis  and  subsequent  test  con¬ 
struction  is  determined  by  the  correlations  of  the  tests  with  criteria. 
Knowledge  of  the  criteria  used  is  necessary,  therefore,  in  order  to  evaluate 
the  statistics  concerning  individual  tests  to  be  reported  in  the  chapters  to 
follow. 

The  pilot  criterion. — The  criterion  of  success  as  a  pilot  routinely  used 
in validation studies was graduation or elimination from primary flight
training. Most eliminations usually occurred during primary training.*
A  smaller  proportion  of  students  was  eliminated  from  basic  training  and 
a  still  smaller  proportion  from  advanced  and  transitional.  In  all  three 
phases  the  great  majority  of  eliminations  was  for  flying  deficiency.4  Few 
eliminations  from  pilot  training  for  academic  deficiency  occurred  either  in 
the  ground-school  phase  of  flying  training  or  in  the  preflight  school. 

After a student was classified, he spent 2 months each in preflight,
primary, basic, and advanced training. Using the criterion of eliminations
in primary training, validity data matured in a minimum period of from
2 to 5 months depending on when a test was given. When a
classification-battery test was to be validated, a period of approximately 5 months was
required. Many experimental tests were also given during the
classification period so that the same time lag existed for them. Other experimental
tests were given to classified pilots as they finished preflight training.
Data on these men were then available in 2 months. This procedure made
possible quick validation of many experimental tests.

The  navigator  criterion. — The  standard  criterion  of  success  as  a  navi¬ 
gator  was  graduation  or  elimination  from  advanced  navigation  training, 
the  only  navigation  phase  of  training  beyond  preflight.  The  important 
variables  entering  into  this  criterion  were  few  in  number.  These  were 
grades  in  navigation  theory,  ground  missions,  and  flight  missions,  of 
which the third was most heavily weighted.* Every evidence indicates that
this  criterion  was  quite  reliable. 

Because of the small proportion of students classified as navigators,
validation analyses for navigation were almost restricted to classification
tests. Many months were necessary to accumulate as many as 1,000 cases
of classified navigators on a test given during the classification period at a
single classification center. With time in preflight and advanced
navigation added, validation on a sufficiently large sample of a test for the
navigator criterion took approximately one year. It later became possible to
test a few classes of classified navigators graduating from preflight in all
three flying training commands with small batteries of experimental tests.

The  bombardier  criterion. — The  successful  bombardier,  for  validation 
purposes,  was  the  graduate  from  advanced  bombardier  school.  Graduation 
or  elimination  was  largely  determined,  in  turn,  by  the  "average-circular- 
error”  and  "percent-hits”  scores  obtained  on  practice  bombing  missions. 
The instructor's judgment concerning a student's capability as a
bombardier also entered into the decision to graduate or eliminate, but in a
nonsystematic fashion. Since the objective measures of bombardier
proficiency, i.e., circular error and percent hits during individual training, are
known to have had practically zero reliability, any reliability in the
graduation-elimination criterion was probably due to the subjective judgments
of instructors. That the bombardier criterion did have some degree of
reliability  is  shown  by  the  consistent  positive  correlations  obtained  be¬ 
tween  certain  tests  and  that  criterion.* 

The  same  comments  made  concerning  the  relatively  small  number  of 
classified navigators also apply to bombardiers. Adequate samples were
difficult  to  obtain  on  tests  other  than  those  in  the  classification  battery 
until  a  few  classes  of  preflight  graduates  were  tested  with  small  batteries 
of  experimental  tests.  The  problem  was  made  even  more  complicated  by 
the  unreliability  of  the  criterion.  If  the  top  possible  correlation  between 
a test and a criterion is, for example, 0.30, one cannot be reasonably
certain that any correlation at all exists unless very large numbers of cases
are available.
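The argument of this paragraph can be made concrete with a minimal sketch. It rests on two standard results that are not stated in the report itself and are brought in here as assumptions: the attenuation ceiling (an observed correlation cannot exceed the square root of the product of the two reliabilities) and the Fisher z approximation for the significance of a correlation. The function names and the reliability figure of 0.09 are hypothetical.

```python
import math
from statistics import NormalDist


def validity_ceiling(test_reliability: float, criterion_reliability: float) -> float:
    """Upper bound on an observed test-criterion correlation (attenuation)."""
    return math.sqrt(test_reliability * criterion_reliability)


def cases_needed(correlation: float, alpha: float = 0.05) -> int:
    """Smallest N at which an observed correlation of this size would be
    significant in a two-sided test, using the Fisher z approximation
    (standard error of z taken as 1 / sqrt(N - 3))."""
    if not 0 < abs(correlation) < 1:
        raise ValueError("correlation must lie strictly between 0 and 1 in size")
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil((z_crit / math.atanh(abs(correlation))) ** 2 + 3)


# A criterion reliability of 0.09 (hypothetical) caps validity at 0.30 even
# for a perfectly reliable test.
print(f"ceiling = {validity_ceiling(1.0, 0.09):.2f}")

# Real tests fall well below the ceiling, and the cases required climb quickly.
for r in (0.30, 0.15, 0.10, 0.05):
    print(f"r = {r:.2f}: about {cases_needed(r)} cases")
```

Since most tests would be expected to correlate with such a criterion at values well below the ceiling, the number of cases required to establish any relationship rises rapidly as the correlation shrinks.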

Test  Construction  by  Subareas 

In  order  to  carry  out  the  plan  to  construct  and  validate  at  least  one  test 
in  each  subarea,  the  problem  immediately  arose  as  to  when  a  test  did  or 
did  not  measure  any  hypothesized  ability.  The  first  step  is  an  obvious  one. 
If one cannot be certain that a given test is a good measure of the ability,
a  number  of  tests  should  be  constructed  in  the  subarea  and  experimentally 
administered.  In  selecting  representative  tests  of  the  ability,  reliability  is 
a  possible  criterion.  Within  rather  wide  limits,  however,  reliability  was 
considered  to  be  relatively  unimportant.  Much  more  important  were  the 
intercorrelations  of  the  experimental  tests  and  their  correlations  with 
tests  then  in  the  classification  battery.  The  technique  of  factor  analysis, 
which  is  best  described  as  an  extension  of  correlational  analysis,  was 
therefore  considered  to  be  an  important  aid  in  selecting  tests  to  measure 
the  ability  in  question. 

* For a more complete discussion of the bombardier criterion, see Report No. 9 of this series.

USE OF CORRELATIONAL AND FACTOR ANALYSES
IN TEST CONSTRUCTION

Determining  Uniqueness  of  Contribution 

It  is  becoming  increasingly  evident  that,  in  addition  to  the  concepts  of 
reliability  and  validity,  the  concept  of  uniqueness  of  contribution  or 
purity  deserves  a  central  place  in  test  construction  theory.  When  one  is 
faced  with  the  practical  problem  of  putting  together  a  battery  of  tests  to 
predict  a  criterion,  individual  test  reliabilities  and  validities  shrink  in  im¬ 
portance.  Beta  weights,  which  are  a  function  of  test  intercorrelations  as 
well  as  validities,  become  the  criteria  on  which  a  test  is  accepted  or 
rejected. 
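For the simplest case of two tests, the dependence of beta weights on both validities and the intercorrelation can be written out directly. The sketch below uses the standard two-predictor formulas for standardized regression weights; the function name and all numerical values are hypothetical illustrations, not figures from the program.

```python
import math
from typing import Tuple


def beta_weights_two_tests(
    validity_a: float, validity_b: float, intercorrelation: float
) -> Tuple[float, float, float]:
    """Standardized regression (beta) weights for two tests predicting one
    criterion, together with the resulting multiple correlation."""
    denominator = 1 - intercorrelation ** 2
    if denominator <= 0:
        raise ValueError("intercorrelation must be less than 1 in absolute size")
    beta_a = (validity_a - intercorrelation * validity_b) / denominator
    beta_b = (validity_b - intercorrelation * validity_a) / denominator
    multiple_r = math.sqrt(beta_a * validity_a + beta_b * validity_b)
    return beta_a, beta_b, multiple_r


# Hypothetical validities of 0.40 and 0.35 under two different intercorrelations.
for r_ab in (0.80, 0.20):
    beta_a, beta_b, multiple_r = beta_weights_two_tests(0.40, 0.35, r_ab)
    print(f"r_ab = {r_ab:.2f}: beta_a = {beta_a:.3f}, "
          f"beta_b = {beta_b:.3f}, R = {multiple_r:.3f}")
```

In the first case the second test, although nearly as valid as the first, receives a beta weight close to zero because it largely duplicates the first; in the second case both tests earn substantial weights.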

Relationship  to  correlational  analysis. — If  a  test  contributes  informa¬ 
tion  concerning  individual  differences  over  and  above  that  furnished  by 
a  battery  of  other  tests,  that  fact  can  be  ascertained  through  correlational 
analysis  alone.  The  multiple  correlation  between  the  test  and  a  reference 
battery,  when  corrected  for  attenuation,  must  differ  significantly  from 
1  if  the  test  is  to  make  a  real  contribution.  This  contribution  consists  of 
the  measurement  of  a  new  function  or  functions. 
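A rough numerical sketch of this check follows. It assumes the usual correction for attenuation (division by the square root of the product of the reliabilities) and, by default, corrects only for the unreliability of the test itself; the function names and figures are hypothetical. It gives point values only and does not carry out the significance test that the text calls for.

```python
import math


def corrected_multiple_r(
    multiple_r: float, test_reliability: float, battery_reliability: float = 1.0
) -> float:
    """Multiple correlation between a test and a reference battery after the
    usual correction for attenuation, capped at 1."""
    corrected = multiple_r / math.sqrt(test_reliability * battery_reliability)
    return min(corrected, 1.0)


def unique_reliable_share(multiple_r: float, test_reliability: float) -> float:
    """Share of the test's total variance that is reliable yet not
    predictable from the reference battery."""
    return max(test_reliability - multiple_r ** 2, 0.0)


# Hypothetical test 1: R = 0.60 with the battery, reliability 0.90.
print(corrected_multiple_r(0.60, 0.90), unique_reliable_share(0.60, 0.90))

# Hypothetical test 2: R = 0.85 with the battery, reliability 0.75.
print(corrected_multiple_r(0.85, 0.75), unique_reliable_share(0.85, 0.75))
```

The first test leaves a sizable share of reliable variance unaccounted for by the battery and so may measure something new; the second, once its unreliability is allowed for, is almost fully predictable from the tests already in use.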

Relationship  to  factor  analysis. — Correlational  analysis  alone  is  suffi¬ 
cient  to  assess  a  test’s  contribution  to  a  battery.  Factor  analysis  is  neces¬ 
sary  in  order  to  define  the  nature  of  that  contribution.  While  the  objec¬ 
tivity  of  the  application  of  factor  analysis  to  this  and  similar  problems 
may  have  been  overrated,  the  usefulness  of  the  technique  definitely  has 
not.  Factor  results  constitute  an  indispensable  aid  to  the  test  constructor 
who is interested in what his tests measure and why they are valid. One
very important use is to gain insight into the functions responsible for
beta weights in regression equations.

In  deciding  which  tests  in  a  group  designed  to  measure  "foresight  and 
planning,”  for  example,  were  most  worth  validating,  factor  analysis  was 
a considerable aid. A supposed foresight-and-planning test may, for
example, turn out to be functionally very like the Arithmetic Reasoning
Test already in the classification battery. No matter how different the
apparent content of the two tests may be, the experimental test could not
have a high priority for validation. A second foresight-and-planning test,
on  the  other  hand,  may  reliably  define  a  new  factor.  Whether  or  not  the 
new  factor  should  now  be  given  the  name  of  the  hypothetical  function  it 
was  designed  to  measure  is  not  always  determinable.  The  test  which  best 
measures  the  factor,  however,  should  certainly  be  validated. 

A  Guide  to  Test  Construction 

In  a  previous  section,  it  was  stated  that  validation  findings  were  ex¬ 
pected  to  guide  test  construction  beyond  the  initial  stages  that  resulted 
from  the  available  information  concerning  the  jobs  of  the  pilot,  naviga¬ 
tor,  and  bombardier.  This  turned  out  to  be  only  partially  true.  Validation 
of a relatively few tests will usually be a sufficient guide to the construction
of other valid tests. Beta weights of the additional tests, however, may
not differ significantly from zero. For this reason, correlational and factor
analysis  became  as  important  as  validation  findings  in  the  guidance  of  test 
construction. 

Increasing  unique  contribution. — Factor  analysis  serves  a  very  useful
function  in  pointing  out  the  ways  in  which  the  unique  contribution  of  a 
given  test  can  be  increased.  For  example,  a  new  factor  is  discovered  in 
an  analysis  on  which  no  test  has  a  loading  greater  than  0.40.  The  test 
with  the  highest  loading  on  the  factor  also  has  high  loadings  on  the  verbal 
and  numerical  factors.  The  first  step  is  to  form  an  hypothesis  concern¬ 
ing  the  nature  of  the  new  factor.  Equally  important  is  to  decide  what 
features  of  the  test  contribute  toward  the  verbal  and  numerical  loadings. 
The  second  step  is  to  vary  the  content  of  the  test,  the  directions,  the 
method  used  in  recording  the  answers,  or  the  time  limit  so  that  the  verbal 
and  numerical  loadings  will  be  decreased  and  the  loading  on  the  new 
factor  will  be  maximized.  The  new  test  is  then  administered  along  with 
selected  reference  tests  in  order  to  check  the  factorial  make-up  of  the 
revision.  This  process  may  be  continued  until  satisfactory  results  are 
obtained. 

The  need  for  new  test  construction  is  indicated  by  factor-analysis 
findings  in  yet  another  way.  A  test  with  good  reliability  may  show  very 
little  communality  with  the  rest  of  a  test  battery.  It  is  relatively  easy 
in  most  cases  to  convert  a  nonerror  specific  factor  to  a  common  factor 
by  appropriate  test  construction.  This  is  particularly  important  if  the 
test  is  known  to  have  validity  for  some  specialty  over  and  above  that
predictable  from  its  known  common-factor  content.  In  this  connection, 
it  should  be  pointed  out  that  the  prediction  of  test  validities  on  the  basis 
of  a  summation  of  products  of  test  loadings  and  criterion  loadings  on 
known  factors  has  been  quite  successful.  The  evidence  for  this  will  be 
discussed  in  considerable  detail  in  chapter  28. 
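
The summation of products mentioned above can be written out as a short computation. The following Python sketch is illustrative only: the function name `predicted_validity` and the loadings are hypothetical and are not data from this volume, and the factors are assumed to be orthogonal, so that the predicted correlation is simply the sum of cross-products of loadings.

```python
def predicted_validity(test_loadings, criterion_loadings):
    """Predict a test's validity as the sum of products of its loadings
    and the criterion's loadings on the same orthogonal common factors."""
    if len(test_loadings) != len(criterion_loadings):
        raise ValueError("loadings must refer to the same set of factors")
    return sum(t * c for t, c in zip(test_loadings, criterion_loadings))


# Hypothetical loadings on three known factors (illustrative values only).
test_loadings = [0.50, 0.30, 0.40]
criterion_loadings = [0.10, 0.20, 0.40]

print(round(predicted_validity(test_loadings, criterion_loadings), 2))
```

A validity observed to be well above the value predicted in this way would suggest that the test carries valid variance outside its known common-factor content.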

Empirically  derived  categories. — Factor  analysis  promises  to  furnish 
the  test  constructor  empirically-derived,  orthogonal  categories  for  his
tests.  Considerable  progress  has  been  made  in  establishing  these  cate¬ 
gories,  both  by  civilian  and  military  psychologists.  Empirically-derived 
categories  are  most  useful  to  the  test  constructor  in  conjunction  with  job 
analyses.  The  job  analyst,  in  using  factor  results,  has  a  framework  for 
his  description  of  the  job.  The  factor  categories,  in  addition,  direct  the 
analyst's  observations  toward  details  of  the  job  that  might  easily  go  un¬ 
noticed  otherwise. 

Scoring  Formulae  in  Relation  to  Factor  Findings

There are several approaches to the development and use of scoring
formulae for tests. All are represented in the tests discussed in this
volume. The final practice, which is recommended, grew out of
correlational and factor studies.

A priori formulae. — On the basis of the random-guessing hypothesis,
the probability of obtaining a correct answer by chance is 0.5 in a
2-choice test, 0.33 in a 3-choice test, etc. The formula R - W/(k - 1),
where R is the number of right answers, W is the number of wrong
answers, and k is the number of alternative responses, is expected to
convert a chance score to zero. A formula of this type has been very
commonly used, even in power¹ tests, although it may have little
empirical justification. This practice has one nonstatistical advantage —
both examinees and psychologically unsophisticated critics can be told
that even though the right answer can be guessed in a multiple-choice
test, guessing will not be profitable.
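
The a priori formula can be checked with a short computation. The Python sketch below is illustrative; the function names are assumptions introduced here, and R and W are taken to be the counts of right and wrong answers. It shows that an examinee who guesses at random on every item is expected to receive a corrected score of zero.

```python
def corrected_score(rights, wrongs, k):
    """A priori scoring formula R - W/(k - 1) for a k-choice test."""
    if k < 2:
        raise ValueError("a multiple-choice item needs at least 2 alternatives")
    return rights - wrongs / (k - 1)


def expected_chance_score(n_items, k):
    """Corrected score expected from pure random guessing on every item."""
    expected_rights = n_items / k
    expected_wrongs = n_items - expected_rights
    return corrected_score(expected_rights, expected_wrongs, k)


# An examinee with 30 rights and 10 wrongs on a 5-choice test.
print(corrected_score(30, 10, 5))

# Random guessing on 60 four-choice items: 15 rights and 45 wrongs expected.
print(expected_chance_score(60, 4))
```

Omitted items do not enter the formula, which is one reason its behavior in speeded tests may differ from that in power tests.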

Maximum-reliability formulae. — Right and wrong scores on a test can
be weighted so that a maximum degree of reliability is obtained. This
should not be done, however, unless it is known that right and wrong
responses are both measuring the same thing. If the factor patterns of
rights and wrongs are identical, then the maximum-reliability scoring
formula will be identical with the maximum-validity formula discussed
in the next section.

Maximum-validity formulae. — When a test is being considered in
isolation, a maximum-validity formula will be found to be most useful.
The formula which maximizes the correlation between a test and a given
criterion may not be the same, however, for a different criterion. It is
conceivable that a test would have as many scoring formulae as there are
criteria that it is used to predict, if right and wrong scores actually
measure different functions. Right and wrong scores are very likely to
be factorially dissimilar, as a matter of fact, in any speeded test. A number
of cases will be presented in the chapters to follow in which this is true.

Use of right and wrong scores separately. — The finding that rights and
wrongs often measure different functions came late in printed-test
research. As a result, the procedure that is now recommended has been
followed in relatively few test analyses. It now seems clear that the best
way to handle right and wrong scores is to treat them as separate
variables in test validation and analysis. A scoring formula should not be
used in a classification battery except in rare cases because beta weights
for rights and wrongs may differ from one criterion to another.
Retention and weighting of either score in the final battery should depend
upon the respective beta weights determined from the matrix of
intercorrelations of the entire battery.
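
The point that beta weights for rights and wrongs may differ from one criterion to another can be illustrated with a simplified case. The Python sketch below is a reduction, not the procedure of this program: it treats rights and wrongs as the only two predictors, whereas the text calls for weights determined from the intercorrelations of the entire battery. The function name and all correlations are hypothetical.

```python
def beta_weights_two_predictors(r_1c, r_2c, r_12):
    """Standardized regression weights for two predictors of one criterion.

    r_1c, r_2c: validities of predictors 1 and 2; r_12: their intercorrelation.
    """
    if abs(r_12) >= 1:
        raise ValueError("predictors must not be perfectly correlated")
    denominator = 1 - r_12 ** 2
    beta_1 = (r_1c - r_12 * r_2c) / denominator
    beta_2 = (r_2c - r_12 * r_1c) / denominator
    return beta_1, beta_2


# Hypothetical values: rights and wrongs on one test correlate -0.30.
r_rights_wrongs = -0.30

# Criterion A: rights valid 0.40, wrongs valid -0.20.
beta_a = beta_weights_two_predictors(0.40, -0.20, r_rights_wrongs)

# Criterion B: rights valid 0.30, wrongs valid -0.35.
beta_b = beta_weights_two_predictors(0.30, -0.35, r_rights_wrongs)

print([round(b, 3) for b in beta_a])
print([round(b, 3) for b in beta_b])
```

With these invented figures the wrongs score would receive little weight for the first criterion and more weight than the rights score for the second, so that no single scoring formula would serve both.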

TYPICAL  HISTORY  OF  A  TEST 

The following outline of the typical developmental history of a test does
not cover all tests in this volume. It applies to intellectual and perceptual
tests much more than to temperament tests. It is perhaps more ideal
than typical, but the writer is convinced that if the better aspects of this
general procedure had been followed more religiously, test research would
have been even more productive.

1 In a power test, when there is an unequal number of [...] the correlation between [...]

Choice  of  the  Function  to  be  Investigated

In  the  early  days  of  printed-test  construction,  subareas  of  research
were  determined  largely  by  the  analysis  of  faculty  board  proceedings 
which  was  in  turn  reflected  in  the  coding  system.  A  representative  test 
or  tests  in  each  subarea  was  desired.  Higher  headquarters  indicated  the 
order  in  which  these  tests  were  to  be  supplied.  A  good  account  of  the 
development  of  tests  in  a  subarea  according  to  this  plan  is  given  in 
chapter  9,  Foresight  and  Planning  Tests. 

Later, the decision to investigate a given function frequently arose
from a combination of validation and factor-analysis findings. An
example of this sort is found in chapter 10, Integration Tests.

Test ideas. — After the function to be investigated, e.g., foresight and
planning, had been selected, the personnel assigned to test construction
spent a period of time reading available job descriptions, interviewing
flying personnel, discussing the problem among themselves, etc.
Anything that might lead to a likely test idea was investigated.

As  test  ideas  were  originated — and  they  often  multiplied  in  a  re¬ 
markable  fashion — those  responsible  were  asked  to  enlarge  upon  them, 
to  write  tentative  directions,  to  outline  a  few  items,  and  to  suggest  the 
conditions  for  administration.  At  this  stage,  a  weeding  process  was  re¬ 
quired.  In  the  absence  of  the  completed  test,  and  therefore  any  data, 
this  process  had  to  be  based  upon  professional  judgment  alone.  Rarely, 
however,  was  the  selection  of  an  idea  for  further  development  the  result 
of  only  one  individual’s  judgment.  In  most  cases,  and  ideally,  this  was 
the  result  of  joint  discussion.  The  chief  criterion  was  the  possibility  of 
unique  contribution.  Potential  reliability,  testing  time,  adaptability  to  IBM 
answer  sheets,  and  "face  validity"*  were  other  criteria  used.

The available test ideas in a restricted area were thus reduced to a
number, such as eight, that could be easily administered together for
intercorrelational purposes. The planning of work on the selected tests was
oriented from the start, therefore, toward bringing the entire group to
completion at approximately the same time.

Item  Writing  and  Criticism 

There is an old saying that two heads are better than one. Experience
has shown that this is true in test construction. Whenever possible, two
men were assigned to the development of a single test, one with primary
responsibility, the other with immediate supervisory functions possibly
including one or two other similar tests. These two, working closely
together, produced the experimental version of the test.

* Face validity refers to the characteristic of a test [...]

Before a test was produced for experimental administration, it was
gone over carefully by several independent critics. This step involved
more than mere copy reading. Fundamental conceptions of the
character of the test were frequently questioned. The joint contribution of
several capable individuals more often than not was superior to what
any one alone could produce.

Experimental  Administration  and  Item  Analysis 

Wherever  possible,  experimental  forms  of  a  test  were  administered  in 
advance  of  the  proposed  correlational  study.  This  was  done  in  order  to 
check  the  clarity  of  the  directions,  other  problems  of  administration,  and 
internal  consistency.  Tests  with  relatively  complicated  directions,  problems
of  answer-sheet  marking,  etc.,  might  go  through  four  or  five  forms,
each  with  experimental  tryout  on  small  numbers  of  cases,  before  item 
analysis  was  undertaken. 

Item analysis was considered to be a very important tool.
Experimental forms of a test were almost uniformly made long enough that
considerable item selection might be done. Tests with high internal
consistency were desired for factor-analysis purposes and for potential
inclusion in the classification battery. It was realized that this was not
necessarily the best way to maximize the validity of the individual test.
Maximum validity for a single test was neither necessary nor desirable,
however, since maximum validity of the battery of tests was the goal.

It  should  be  emphasized  that  high  internal  consistency  was  desired  for 
more  than  reliability  alone.  For  one  thing,  items  that  have  low  cor¬ 
relations  with  the  total  score  of  which  they  are  a  part  are  not  necessarily 
unreliable  items.  A  low  correlation  with  total  score  often  indicates  that 
the  item  measures  some  other  function  than  that  measured  by  the  rest 
of  the  test.  High  internal  consistency  was  desired  because  it  increased  the 
chances  of  obtaining  a  pure  test.  Items  of  low  internal  consistency  with 
promise  of  validity  posed  a  problem  for  additional  test  construction,  that 
of  finding  a  test  in  which  they  would  belong. 

Item analyses were used in ways other than for item selection. A
considerable amount of item revision often occurred at this stage in the
development of the test. The item analysis not only furnished the
correlation between item and total score on the test, but it also furnished
information concerning difficulty levels, functioning of misleads, and the
extent to which the test was speeded.

Correlational  and  Factor  Analysis

After item selection and revision had been accomplished, time limits
revised, and directions given a final polishing, all the tests in the subarea
were prepared for correlational administration. This administration often
involved difficulties that could not always be overcome. No formal
provision had been made for such testing. One or two experimental tests
could be given along with the classification battery, but there was not
sufficient time to give an experimental battery. The time of the aviation
student, both during the classification process and during preflight
training, was rather closely scheduled. The most desirable group would have
been composed of unclassified students. Often, however, the only
students available for extra experimental testing had already been classified.
In certain factor analyses to be reported later in the volume, based on
classified students, the results are somewhat biased as compared to those
that would have been obtained if unclassified students had been used.

No matter what the source of subjects happened to be for a given
analysis, classification-test scores were always available from the regular
administration. A selection of the best known of these was made for
inclusion in the matrix of correlations, to serve as reference tests. This
procedure insured that certain known factors would be included in the
analysis and would be readily identified.

It  was  at  this  stage,  also,  that  reliabilities  were  usually  computed.  The 
principal  use  to  which  reliability  estimates  were  put,  as  a  matter  of  fact, 
was,  in  comparison  with  communalities,  to  obtain  an  indication  of  the
amount  of  nonerror  specific  variance  in  a  test. 
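
The comparison described here amounts to a subtraction. The Python sketch below is a minimal illustration under stated assumptions: the factors are orthogonal, so that the communality is the sum of squared common-factor loadings, and the reliability coefficient is taken as the total nonerror variance. The function names and figures are hypothetical.

```python
def communality(loadings):
    """Sum of squared loadings on orthogonal common factors."""
    return sum(a * a for a in loadings)


def nonerror_specific_variance(reliability, loadings):
    """Reliable variance not accounted for by the common factors."""
    return reliability - communality(loadings)


# Hypothetical test: reliability 0.85, loadings on three common factors.
loadings = [0.6, 0.3, 0.2]
print(round(communality(loadings), 2))
print(round(nonerror_specific_variance(0.85, loadings), 2))
```

A large remainder of this kind marks a test whose reliable variance is largely unexplained by the battery, which is the situation in which conversion of a specific factor to a common factor by new test construction is worth attempting.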

Validation 

On  the  basis  of  the  data  accumulated  in  the  preceding  stages,  a  rank¬ 
ing  was  made  of  the  experimental  tests  with  regard  to  their  desirability 
for  immediate  validation.  Promise  of  unique  contribution  was,  of  course, 
the  chief  criterion  employed.  Such  a  ranking  was  necessary  because  dur¬ 
ing  most  of  the  period  covered  by  test  research  the  amount  of  testing 
time  allotted  for  experimental  testing  was  limited.  To  obtain  as  many  as 
a  thousand  unclassified  aviation  students  on  every  test  was  impossible, 
and  samples  of  this  size  were  barely  sufficient  for  pilot  validation  only. 
The  number  of  tests  validated  was  increased  sharply  during  brief  periods 
when  preflight  graduates  were  tested.

CONCLUSIONS 

In  this  chapter  a  generalized  picture  was  sketched  of  printed-test  construction.
It  was  seen  that  from  the  first,  test  construction  was  oriented
toward  the  development  of  tests  of  functions  or  factors  rather  than  to¬ 
ward  job-sample-type  tests.  This  stemmed  from  the  analysis  of  Faculty 
Board  proceedings  which  was  couched  in  terms  of  traits  of  unsuccessful 
pilots.  This  analysis  of  the  important  traits  necessary  in  learning  to  fly 
“the  Army  way”  has  been  considerably  modified  and  enlarged  by  sub¬ 
sequent  factor-analysis  findings.  As  validation  findings  and  factor-anal¬ 
ysis  results  became  available,  the  direction  of  test  research  became  pro¬ 
gressively  less  influenced  by  job-analysis  information. 

The importance of constant and rapid validation of experimental tests
was stressed. As a basis for evaluating the test validities to be presented
in later chapters, the pilot, navigator, and bombardier criteria were briefly
discussed. Reasons for the concentration of test research in the pilot
area were also discussed. These are, in brief, as follows: the greater
ease and promptness of validation against the pilot criterion; the
importance of the pilot problem as a function of initial low validity in this
area and large quotas; the initial high validity for tests against the
navigator criterion; and the lack of reliability of the bombardier criterion.

The  final  section  discussed  the  typical  history  of  an  aptitude  test,  pro¬ 
ceeding  from  selection  of  the  subarea,  formation  of  test  ideas,  and  item 
writing  and  criticism,  through  experimental  administration,  item  analysis, 
and  correlational  and  factorial  analysis,  to  final  validation.

CHAPTER  THREE


Commonly  Used  Statistical 
Procedures1 


Most  of  the  steps  in  the  typical  history  of  a  test  discussed  in  the  pre¬ 
ceding  chapter  involve  statistical  computations  of  one  kind  or  another. 
Report  No.  3  of  this  series  describes  statistical  techniques  employed  in 
all  aspects  of  the  AAF  Aviation  Psychology  program,  so  it  is  unnecessary
to  go  into  detail  here  regarding  those  techniques.  Certain  techniques
were  selected  as  standard  for  use  in  the  development  of  printed
tests,  however,  and  so  it  is  desirable  to  set  forth  an  account  of  the  adapta¬ 
tion  of  those  particular  methods — to  account  for  the  choice  of  methods, 
to  mention  any  special  variation  of  them  (for  there  were  some),  and  to 
set  down  conclusions  based  upon  extensive  experiences  with  them.  This 
chapter  will  also  serve  the  purpose  of  explaining  the  nature  of  most 
tabular  material  in  the  chapters  that  follow,  as  well  as  the  nontabular 
statistics  used  in  describing  tests. 

RELIABILITY 

Reliability  has  usually  been  defined  as  the  correlation  between  com¬ 
parable  or  interchangeable  measures  of  the  same  thing.  Other  than  to 
point  out  that  the  use  of  the  singular  word  "thing”  may  legitimately 
cover  a  factorially  complex  test — that  is,  comparability  does  not  imply 
item-for-item  correspondence  within  a  test,  but  merely  from  one  form  to 
the  other — one  does  not  need  to  amplify  this  definition  in  any  way.  Reli¬ 
ability  as  thus  defined  is  a  useful  concept  in  test  analysis.  In  most  cases, 
also,  the  definition  unequivocally  suggests  the  appropriate  technique 
of  estimation. 

Correlation  between  Comparable  Forms 

The technique of reliability estimation that has been most commonly
used in printed-test development is a part I-part II correlation.* It
involves computing the correlation between separately timed but
comparable parts of a single test printed within a single booklet and
administered in immediate succession. This procedure differs from the
usual one involving comparable forms in two particulars: (1) Comparable
forms are usually printed as separate booklets, and (2) are usually
administered with a time interval between them. Some test technicians
believe that the intervening time interval is desirable, since it would
presumably take into account function fluctuation within the individual from
time to time, as well as function fluctuation during the course of the
test. Data are available, however, on four rather different tests, which
show that the reliability estimate is not significantly affected by the
difference between immediate and somewhat delayed administration of
the second part.

1 Written by Capt. Lloyd G. Humphreys.

* In the tables of this volume, this is referred to as an alternate-forms type of reliability.

These four tests were administered in separately timed halves, and
with two time-interval conditions. In the first condition, the second half
was administered immediately after the first. In the second condition,
about 4 hours of time and approximately half the tests in the group-test
classification battery intervened between the 2 halves. The tests were
selected as ones thought likely to show a decrement in reliability after an
interval, if such a decrement does indeed occur. The tests chosen (a)
were speeded and (b) called for a rather complex and novel task,
involving extensive instructions. The tests were administered in pairs. A
given group received one test of the pair without appreciable interval
and the other with the 4-hour interval, and then the conditions were
reversed for the next group. Approximately 1,000 cases were tested with
each pair of tests, 500 in each sequence. The results are shown in table
3.1.


Table 3.1. — Experimental test reliabilities with and without time interval between parts I and II

                                                  Preaviation cadets only     Preaviation cadets plus
                                                                              airplane mechanics
Test 1                               Statistic    Interval    No interval     Interval    No interval

Decoding, CI214AX2                   N            238         355             426         439
                                     M1 †         12.33       10.76           10.47
                                     M2           1?.20       ?               12.31       12.48
                                     S.D.1 †      6.34        6.40            6.39
                                     S.D.2        6.60        6.64            6.79        6.85
                                     r            .58         .58             .64         .63

Estimation of Length, CP631A         N            355         238             439         42?
                                     M1           16.53       18.58           16.11       17.66
                                     M2           11.61       12.50           11.62       11.97
                                     S.D.1        7.22        7.48            7.10        7.3?
                                     S.D.2        6.64        6.51            6.50        6.46
                                     r            .41         .40             .4?         .43

Object Identification,               N †          524         193             448
[code illegible]                     M1 †         46.94       48.13           41.02
                                     M2 †         42.64       41.39           36.09
                                     S.D.1 †      14.74       16.19           18.12
                                     S.D.2 †      12.55       12.71           13.96
                                     r †          .60         .65             .64

Visualization of Maneuvers,          N †          193         525             448
[code illegible]                     M1 †         16.12       20.36           11.72
                                     M2 †         17.79       19.7?           12.35
                                     S.D.1 †      10.02       10.74           10.15
                                     S.D.2 †      10.85       11.20           11.13
                                     r †          .82         .?5             .84

? A character that is illegible or doubtful in the source.
† Only three values appear in the source for this row; they are listed in source order, and their column assignment is uncertain.

1 For description of these tests see chapter 7, [title illegible]; chapter 15, Size and
Distance Estimation Tests; chapter 19, Spatial Tests; and chapter 12, Visualization Tests.


The intervening time interval and activities in this study are thus seen
to have no measurable effect on reliability estimates. While it is possible
that a longer delay, or other types of activities might produce such an
effect, it should be noted that the delay and activities chosen represent the
typical testing situation for correlational studies.

Advantages of the part I-part II technique. — Experience has shown
that, for most tests, reasonably comparable forms or parts can be
constructed without the use of elaborate trial forms and statistical analyses.
Sophisticated inspection of the items placed in the two parts, if followed
by a comparison of the two means and standard deviations, is usually
a sufficiently rigorous technique. The labor involved in constructing two
forms or parts is thus not excessive; printing in a single booklet
reduces cost and inconvenience in administration; and having separately
timed parts makes the method applicable to speed tests as well as power
tests. As a matter of fact, this is the only satisfactory method applicable
to both speed and power tests.

Odd-Even  Estimates

In  a  few  cases  odd-even  estimates  of  reliability  were  the  only  ones 
available,  even  on  highly  speeded  tests.  These  are,  of  course,  over-esti¬ 
mates  of  the  reliability  of  speed  tests.  It  is  not  so  generally  realized,  how¬ 
ever,  that  odd-even  coefficients  may  underestimate  the  reliability  of  a 
power  test,  particularly  if  the  test  contains  a  small  number  of  items,  and 
if  the  test  items  measure  different  factors.  When  such  reliabilities  are 
presented,  attention  is  called  to  their  deficiency. 

Use of the Spearman-Brown Formula

When the two parts correlated are truly comparable, i.e., when the
product-moment correlation between paired items is 1.00 when corrected
for attenuation, the Spearman-Brown correction gives a correct
statement of the reliability of the two parts combined. The formula has been
applied, however, in a number of cases where the two parts were not
completely comparable. If the standard deviations of the two parts are
not equal, application of the formula results in a slight underestimation
of the reliability of the entire test. Lack of comparability of subject
matter may result in grosser underestimates. Use of the Spearman-Brown
formula will result in overestimates, on the other hand, when errors of
measurement are correlated.
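
For reference, the correction in its usual general form can be sketched in Python as follows. The function name is an assumption introduced here, and the formula is given in its commonly stated form rather than quoted from this volume. The example steps up a part I-part II correlation of .58, the value reported for the Decoding test in table 3.1, to an estimate for the two parts combined.

```python
def spearman_brown(r_parts, length_factor=2.0):
    """Estimated reliability of a test lengthened by `length_factor`,
    given the correlation between comparable parts of the original length."""
    denominator = 1 + (length_factor - 1) * r_parts
    if denominator <= 0:
        raise ValueError("correlation too low for the requested length factor")
    return length_factor * r_parts / denominator


# Two comparable parts correlating .58, combined into one test.
print(round(spearman_brown(0.58), 3))
```

The estimate holds only under the comparability conditions stated in the paragraph above; unequal standard deviations or unlike subject matter in the two parts make it an underestimate, and correlated errors make it an overestimate.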

Uses  for  Reliability  Estimates 

Reliability  estimation  is  not  an  end  in  itself.  In  a  selection  program 
one  should  be  concerned  about  errors  of  measurement  only  as  they  affect 
validity.  In  a  battery  of  tests  it  is  usually  more  profitable  to  add  a  test  of 
a  new,  valid  function  than  to  increase  the  length,  and  therefore  the  re¬ 
liability,  of  one  of  the  tests  already  in  use. 

It is useful, on the other hand, to have a reliability coefficient in
analytical work with tests. Does the correlation between tests A and B
represent all of their nonchance variances? How much would the validity
of test A be increased if it were doubled in length? In any given factor
analysis do a test's factor loadings account for all of its nonchance
variance? Questions of this type can be answered knowing the correlation
between comparable forms of a test. The greater the complexity of the
test, the more important it is in answering these questions to have
item-for-item correspondence in the two forms, and the greater is the error
involved in using any other estimate of reliability.
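
The first two questions can be given numerical form with the standard correction for attenuation and the standard formula for the validity of a lengthened test. The Python sketch below is illustrative only; the function names and all coefficients are hypothetical, and the formulas are stated in their commonly used form rather than taken from this volume.

```python
import math


def corrected_for_attenuation(r_ab, r_aa, r_bb):
    """Correlation between A and B with errors of measurement removed."""
    if r_aa <= 0 or r_bb <= 0:
        raise ValueError("reliabilities must be positive")
    return r_ab / math.sqrt(r_aa * r_bb)


def validity_if_lengthened(r_xy, r_xx, length_factor=2.0):
    """Validity expected when a test of reliability r_xx is lengthened
    `length_factor` times with comparable material."""
    return r_xy * math.sqrt(length_factor / (1 + (length_factor - 1) * r_xx))


# Hypothetical tests A and B: r = .42, reliabilities .64 and .81.
print(round(corrected_for_attenuation(0.42, 0.64, 0.81), 3))

# Hypothetical test: validity .30, reliability .60, doubled in length.
print(round(validity_if_lengthened(0.30, 0.60), 3))
```

A corrected correlation well below 1.00 indicates that the two tests do not share all of their nonchance variance. The small gain from doubling in the second example agrees with the earlier remark that adding a test of a new, valid function is usually more profitable than lengthening an existing one.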


INTERNAL  CONSISTENCY 

Although  reliability  necessarily  increases  with  increasing  internal  con¬ 
sistency  of  items,  a  reliable  test  is  not  necessarily  internally  consistent. 
It  is  possible  to  have  a  perfectly  reliable  test  with  zero  correlations  among 
its  items,  i.  e.,  with  zero  internal  consistency.  This  fact  demonstrates  the 
need  for  two  concepts,  and  two  terms  in  this  area. 

Kuder-Richardson Formulas

Of the Kuder-Richardson formulas (8) the one most widely used on
tests in this volume is their No. 21, which involves the mean difficulty
level of all of the items in the test. If the items do not vary widely in
difficulty level, the error involved in not using the more accurate formula
No. 20 is not great. With a wide range of item difficulties the latter
formula is sufficiently precise. It makes the same assumptions as the analysis
of variance, and in fact, is algebraically equal to Hoyt's formula (5)
when the latter is applied to a test consisting of unit-weighted items.
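
The relation between the two formulas can be seen in a small computation. The Python sketch below gives formulas No. 20 and No. 21 in their commonly stated form, which is an assumption here since the formulas are not printed in this chapter. The function names and the response matrix are hypothetical, items are scored 1 or 0, and the variance of total scores is computed with N in the denominator.

```python
def _total_score_stats(responses):
    """Number of items, mean, and variance (N denominator) of total scores."""
    n_cases, n_items = len(responses), len(responses[0])
    totals = [sum(row) for row in responses]
    mean = sum(totals) / n_cases
    variance = sum((t - mean) ** 2 for t in totals) / n_cases
    if n_items < 2 or variance == 0:
        raise ValueError("need at least 2 items and nonzero score variance")
    return n_items, mean, variance


def kr20(responses):
    """Formula No. 20: uses the difficulty of each item separately."""
    k, _, variance = _total_score_stats(responses)
    n_cases = len(responses)
    p = [sum(row[i] for row in responses) / n_cases for i in range(k)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    return (k / (k - 1)) * (1 - sum_pq / variance)


def kr21(responses):
    """Formula No. 21: uses only the mean difficulty of all items."""
    k, mean, variance = _total_score_stats(responses)
    return (k / (k - 1)) * (1 - mean * (k - mean) / (k * variance))


# Hypothetical matrix: five examinees by four items of graded difficulty.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(responses), 3), round(kr21(responses), 3))
```

Because the items in this invented example differ widely in difficulty, No. 21 falls noticeably below No. 20; with items of equal difficulty the two values coincide.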

Uses  for  Internal-Consistency  Coefficients 

Internal-consistency  coefficients  are  often  used  as  estimates  of  reli¬ 
ability  coefficients.  This  must  be  done  with  care,  however,  since  the  two 
are  only  equal  for  a  perfectly  homogeneous  test.  Sophisticated  inspection
is  an  imperfect  guide  in  using  internal-consistency  coefficients  in  this 
way. 

The discrepancy between an internal-consistency coefficient and an
estimate of reliability obtained from the correlation between comparable
forms is somewhat indicative of the degree of heterogeneity of the test
items. The larger the difference between the two, the greater is the degree
of heterogeneity. This criterion is a sure indication of factorial
complexity. The reverse is not true, however. If all items are factorially complex
in themselves and to the same degree, the test will be both highly
homogeneous and factorially complex.

Internal  Consistency  at  the  Item  Level 

The ultimate criterion of an item's consistency with the rest of the
test of which it is a part is the level of its correlations with the other
items. Since these correlations are unobtainable without excessive labor
in most cases, some way of relating the item to total test score is used
instead. Many methods of doing this have been suggested, but the
approach to this problem has been characterized more by expediency than
by rationality.

The  phi  coefficient. — The  item  statistic  used  on  most  of  the  tests  in  this 
volume  is  clearly  in  the  expedient  group,  though  it  can  be  related  more 
directly  to  a  rational  technique  than  most.  The  procedure  used  has  been 
to  compute  the  phi  coefficient  between  passing  or  failing  the  item  and  be¬ 
longing  to  criterion  groups  of  the  highest  27  percent  and  the  lowest  27 
percent  of  total  score  on  the  test.  This  procedure  has  a  number  of  ad¬ 
vantages.  The  group  of  papers  is  separated  into  high  and  low  criterion 
groups  at  the  outset;  thereafter  no  further  sorting  of  papers  is  neces¬ 
sary.  With  test  responses  recorded  on  standard  IBM  answer  sheets,  fre¬ 
quencies  of  responses  to  correct  answers  and  misleads  can  easily  be  ob¬ 
tained  by  the  use  of  the  IBM  scoring  machine  equipped  with  item-count 
attachment.  The  phi  coefficients  can  then  be  read  off  a  table  or  nomograph 
(2)  after  frequencies  have  been  transformed  into  percentages  or  pro¬ 
portions.  The  statistical  labor,  compared  with  that  in  computing  biserial 
coefficients,  for  example,  is  thus  incomparably  less. 
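
The statistic that the table or nomograph supplies can also be computed directly from the fourfold table. The Python sketch below is an illustration, not the machine procedure described above: the function names and data are hypothetical, the criterion groups are formed by simple sorting, and ties in total score at the group boundaries are broken arbitrarily.

```python
import math


def phi_from_counts(high_pass, high_fail, low_pass, low_fail):
    """Phi coefficient for a 2 x 2 table of criterion group by item result."""
    numerator = high_pass * low_fail - high_fail * low_pass
    denominator = math.sqrt(
        (high_pass + high_fail)
        * (low_pass + low_fail)
        * (high_pass + low_pass)
        * (high_fail + low_fail)
    )
    # Phi is undefined when a marginal total is zero; this sketch reports 0.0.
    return numerator / denominator if denominator else 0.0


def item_phi(item_correct, total_scores, fraction=0.27):
    """Phi between passing an item and belonging to the high or low group."""
    n = len(total_scores)
    group_size = max(1, round(fraction * n))
    order = sorted(range(n), key=lambda i: total_scores[i])
    low, high = order[:group_size], order[-group_size:]
    high_pass = sum(item_correct[i] for i in high)
    low_pass = sum(item_correct[i] for i in low)
    return phi_from_counts(
        high_pass, group_size - high_pass, low_pass, group_size - low_pass
    )


# Hypothetical data: ten examinees, 1 = item passed, 0 = item failed.
totals = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
item = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0]
print(round(item_phi(item, totals), 3))
```

Since the two criterion groups are equal in size, the coefficient depends only on the proportions passing in each group, which is why it can be read from a table once frequencies are converted to proportions.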

The  phi  coefficient  computed  in  this  way  has  a  number  of  interesting 
properties.  The  maximum  phi  is  obtained  at  a  difficulty  level  of  50  per¬ 
cent  correct  responses.  For  difficulty  levels  deviating  from  50  percent, 
the  maximum  phi  coefficients  become  progressively  lower,  while  the  samp¬ 
ling  stability  of  the  statistic  remains  unchanged.  Since  one's  aim  is  usually 
to  favor  items  near  the  50  percent  level  of  difficulty,  the  phi  coefficient 
serves a double function in item selection. Use of this statistic alone, therefore, tends automatically to produce a test of maximum internal consistency (5), and at the same time optimal difficulty. If, however, the
test’s  specifications  call  for  an  appreciable  number  both  of  very  easy 
and  very  difficult  items,  item  difficulties  will  have  to  be  considered  as  well 
as  the  phi  coefficients. 
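The dependence of the maximum phi on difficulty can be illustrated numerically. The sketch below assumes equal-sized upper and lower criterion groups and uses the standard bound for a 2x2 table with fixed marginals; the function name is hypothetical.

```python
import math

def max_phi_equal_groups(p):
    """Largest phi attainable when the two criterion groups are equal in
    size and the item is passed by proportion p of both groups combined."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    q = 1.0 - p
    return math.sqrt(min(p, q) / max(p, q))

for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"difficulty {p:.1f}: maximum phi = {max_phi_equal_groups(p):.3f}")
```

The ceiling is 1.00 only at 50 percent correct and falls off symmetrically on either side, which is why selecting on phi alone favors items of middle difficulty.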

The standard practice has been to compute two phi coefficients for every item in a test. One is based upon total group; i.e., the computations are made on the assumption that omissions and items not attempted are wrong answers. This value is obviously in part a function of the item's position in the test, if speed is even a minor factor. It is therefore related to the internal consistency of the test as a whole under specified conditions of administration. The second is based upon total answered; i.e., computations are based on only those examinees who attempt the item. This coefficient is indicative of the internal consistency of the item only, independent, except for item interactions, of its position in the test. The distribution constants for item statistics presented in the chapters following are based on total answered in order to give as true a picture as possible of the items themselves. Another condition observed was that no phi coefficient was entered in these distributions if it was based on less than 20 percent of the sample of cases in either criterion group. Thus, items near the end of a speeded test are not covered by these data. No attempt should therefore be made to relate the mean phi coefficients reported either to the standard deviation of the total-score distribution or to the internal-consistency coefficient for the test as a whole. These data are particularly inadequate for a highly speeded test, since one does not expect or desire individually discriminating items.
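The two bases of computation, and the 20 percent rule, can be sketched as follows. The coding of responses (1 = correct, 0 = wrong, None = omitted or not reached) and the function names are assumptions made for this illustration.

```python
import math

def phi(a, b, c, d):
    """Phi for a 2x2 table (a, b = upper pass/fail; c, d = lower pass/fail)."""
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else 0.0

def phi_total_group(upper, lower):
    """'Total group': omitted or unreached responses count as wrong."""
    a = sum(1 for r in upper if r == 1)
    c = sum(1 for r in lower if r == 1)
    return phi(a, len(upper) - a, c, len(lower) - c)

def phi_total_answered(upper, lower, min_fraction=0.20):
    """'Total answered': only examinees who attempted the item are counted.
    Returns None when fewer than min_fraction of either group attempted it."""
    up = [r for r in upper if r is not None]
    lo = [r for r in lower if r is not None]
    if len(up) < min_fraction * len(upper) or len(lo) < min_fraction * len(lower):
        return None
    a, c = sum(up), sum(lo)
    return phi(a, len(up) - a, c, len(lo) - c)

# Hypothetical item late in a speeded test: several examinees never reach it.
upper = [1] * 8 + [0] * 1 + [None] * 1
lower = [1] * 3 + [0] * 3 + [None] * 4
print(phi_total_group(upper, lower), phi_total_answered(upper, lower))
```

In this made-up case the "total group" value is the higher of the two, because unreached items are scored as failures and the low group reaches the item less often.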

An empirical study of various item statistics. — Table 3.2 contains an empirical comparison of various item statistics. The purpose of the study was to compare the sampling stability of several commonly used item statistics in two representative samples of 400 cases each. The test analyzed was Visualization of Maneuvers, CI657CX1 (see ch. 12). Several measures of sampling stability are offered. The first of these is the correlation between comparable item statistics in the two samples. The second is a standard error of measurement computed as follows: $SD\sqrt{1 - r_{I,II}}$, in which SD is the standard deviation of the distribution of statistics over all items. The third is the critical ratio formed by dividing the mean item statistic by the standard error of measurement.

Table 3.2. — Comparison of various item statistics on two samples of 400 classified pilots (68 items from Visualization of Maneuvers, CI657C, were used in these analyses.)

Statistic          Sample   M             SD            r I,II   SE¹           M/SE²
Phi, 27 percent    I        [illegible]   [illegible]   0.91     0.062         6.77
                   II       [illegible]   [illegible]   ....     ....          ....
Phi, 50 percent    I        [illegible]   .208 [?]      .87      .050          5.84
                   II       [illegible]   [illegible]   ....     ....          ....
Flanagan r         I        [illegible]   .204          .87      .073          6.25
                   II       [illegible]   .202          ....     ....          ....
Point biserial r   I        .358          .134          .88      .050          [illegible]
                   II       .345          .158          ....     ....          ....
Biserial r         I        .485          .168          .87      .065          7.24
                   II       .456          .190          ....     ....          ....
Tetrachoric r      I        .460          .180          .79      [illegible]   5.14
                   II       .445          .212          ....     ....          ....

¹ Computed from r I,II and the average standard deviation in the two samples.

² The mean correlations entering into this ratio are, of course, spuriously high since an item is always correlated with a sum in which it is included. In view of the large number of items, the amount of error is very small, and is proportional to the size of the spuriously high mean correlations.

It is obvious from a comparison of the last three columns in table 3.2 that the question of sampling stability is answered somewhat differently by the three different criteria. If one were interested only in the rank order of item statistics in a second sample, there would be little basis for choice among the various item statistics with the possible exception of the tetrachoric correlation. If one were interested in a minimal standard error, a choice of either the point biserial or phi based on all the data would be clearly indicated. The writer is unaware, however, of any application where size of the standard error alone would be important. Lastly, if one were interested in detecting nonchance relationships, the two statistics that make use of all of the data would be the first choice, followed by those utilizing extreme criterion groups. These data, therefore, furnish empirical confirmation for Kelley's (7) theoretical formulation.
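The second and third stability measures are simple enough to check by hand. The sketch below uses the biserial r figures as they read in table 3.2 (standard deviations .168 and .190, means .485 and .456, between-sample correlation .87); averaging the two sample means for the critical ratio is an assumption, since the text says only "the mean item statistic."

```python
import math

def standard_error_of_measurement(sd_1, sd_2, r_between_samples):
    """Average SD over the two samples times sqrt(1 - r), as described above."""
    mean_sd = (sd_1 + sd_2) / 2
    return mean_sd * math.sqrt(1 - r_between_samples)

def critical_ratio(mean_1, mean_2, se):
    """Mean item statistic (assumed: average of both samples) divided by SE."""
    return ((mean_1 + mean_2) / 2) / se

se = standard_error_of_measurement(0.168, 0.190, 0.87)
cr = critical_ratio(0.485, 0.456, se)
print(round(se, 3), round(cr, 2))
```

The standard error agrees with the tabled .065 to three decimals. The ratio comes out slightly above the tabled 7.24, which is what one gets by dividing by the rounded standard error instead.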

Since a biserial, continuous or point, is computationally laborious, a procedure using dichotomous-criterion groups is to be recommended. Discarding the middle 46 percent of scores on the continuous variable not only reduces the amount of item counting to be done, as compared to retaining all of the cases, but also makes nonchance relationships more easily detected. Choice of phi coefficients or Flanagan (1) r depends on whether one is interested in a statistic with a standard error independent of difficulty level or one in which the degree of relationship is independent of difficulty level. A statistic having the first of these two characteristics, i.e., the phi coefficient, has been found to have many advantages for item-analysis purposes.

The intercorrelations of the various item statistics for one sample only are presented in table 3.3. All statistics obviously are measuring much the same things. It is equally clear that more than one factor is involved. The two statistics computed on mutilated distributions are more like each other than they are like anything else; i.e., the correlation between the phi coefficient computed on upper and lower 27 percent groups and the Flanagan r is 0.98. The correlation between the tetrachoric correlation and the phi coefficient computed on upper and lower halves constitutes another doublet because they are computed from identical two-by-two contingency tables. The correlations with the point biserial are uniformly higher than those with any other measure, which indicates that the former may be the most representative item statistic in the group.

Table 3.3. — Intercorrelations of various item statistics for a sample of 400 classified pilots (63 items from Visualization of Maneuvers, CI657C, were used in computing these correlations)

The point biserial correlation. — There are theoretical reasons why the point biserial correlation would be expected to be the most representative internal-consistency statistic. The point biserial, for example, can be most easily and directly related to the inter-item product-moment correlations, for which correlations of items with total score are substituted for reasons of computational convenience.³ It can also be directly related to the standard deviation of the total score distribution, and therefore to the internal consistency of the test as a whole.⁴ A simple expression of the amount of "bootstrapping" involved in the inclusion of item in total score can also be obtained through the use of the point biserial.⁵ The highly satisfactory sampling stability of this statistic, shown by the three criteria in table 3.2, arises because all of the data are used and because it is a product-moment correlation, not an estimate of one; i.e., the percentages falling in the pass and fail categories do not affect its standard error. In fact the only consideration limiting the usefulness of this statistic in internal-consistency analyses is the computational labor involved.

³ [Formula illegible in source], where i and j are any two items, t is total test score, and n is the number of items in the test. $\bar{r}_{ij}$ is, therefore, the mean inter-item correlation, and $\bar{r}_{it}$ is the mean correlation between items and total score. See appendix A for the derivation of this formula.
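For readers who want the computation spelled out, the point biserial between a pass-fail item and total score can be sketched as below. The formula is the usual textbook one (difference between the mean scores of those passing and failing, divided by the standard deviation of all scores, times the square root of pq); the function name and data are made up for the example.

```python
import math
from statistics import fmean, pstdev

def point_biserial(item_correct, total_scores):
    """Product-moment correlation between a 0/1 item and total score."""
    passed = [s for x, s in zip(item_correct, total_scores) if x == 1]
    failed = [s for x, s in zip(item_correct, total_scores) if x == 0]
    if not passed or not failed:
        raise ValueError("item must be passed by some examinees and failed by others")
    p = len(passed) / len(total_scores)
    q = 1.0 - p
    sd = pstdev(total_scores)  # SD of all scores, computed with N in the denominator
    return (fmean(passed) - fmean(failed)) / sd * math.sqrt(p * q)

# Hypothetical data for six examinees.
item = [0, 0, 1, 0, 1, 1]
scores = [2, 3, 5, 4, 6, 8]
print(round(point_biserial(item, scores), 3))
```

Every examinee's score enters the two means and the standard deviation, which is the sense in which all of the data are used, and also the source of the computational labor.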

Substitutes for the point biserial. — Use of the phi coefficient relating pass and fail on the item to upper and lower 27 percent criterion groups, as described in a preceding section, is a reasonably good substitute for the point biserial. The characteristics of the two are quite similar, and there is little loss in the efficiency with which nonchance relationships can be detected. Retaining all of the cases and computing the phi coefficient on a 50-50 split furnishes a statistic that is more highly correlated with the point biserial, but at a cost. The additional cost is the labor involved in obtaining item counts on almost twice the number of answer sheets, with a loss in the efficiency with which nonchance relationships can be detected.

The phi coefficient obtained from all of the cases grouped into upper- and lower-criterion groups has one advantage in that it is more flexible than either the point biserial or the phi coefficient obtained from the 27 percent criterion groups. The point biserial between item and total score is in part a function of the difficulty level of the test as a whole, in addition to being a function of the difficulty level of the item tested. The phi coefficient, analogously, is a function of the splits in both variables, but the split in the criterion can easily be changed if no cases are discarded. If it is desired that the final test be relatively easy, even though the initial items given for try-out had an average difficulty level of 50 percent, the test-score distribution can be split at 75-25, for example. The maximum phi will accordingly be obtained, on the average, for items having difficulty levels of 75 percent, and the final selection of items will be biased in the desired direction.⁶
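The effect of moving the criterion split can be checked with the general ceiling on phi for a 2x2 table with fixed marginals. In the sketch below, `upper_group_fraction` is taken to be the proportion of examinees placed in the upper criterion group (0.75 for a 75-25 split); that reading, and the function name, are assumptions of the example.

```python
import math

def max_phi(item_pass_rate, upper_group_fraction):
    """Largest phi attainable for an item passed by item_pass_rate of all
    examinees when upper_group_fraction of them form the upper group."""
    p, c = item_pass_rate, upper_group_fraction
    if not (0.0 < p < 1.0 and 0.0 < c < 1.0):
        raise ValueError("both proportions must lie strictly between 0 and 1")
    odds_p = p / (1.0 - p)
    odds_c = c / (1.0 - c)
    return math.sqrt(min(odds_p, odds_c) / max(odds_p, odds_c))

for p in (0.25, 0.50, 0.75, 0.90):
    print(f"pass rate {p:.2f}: 50-50 split {max_phi(p, 0.50):.3f}, "
          f"75-25 split {max_phi(p, 0.75):.3f}")
```

With a 75-25 split the ceiling reaches 1.00 only for items passed by 75 percent of examinees, so ranking items by phi shifts the selection toward easier items.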

The procedure of "tailoring" a test for a particular cut-off point, i.e., selecting items of the same difficulty level as the percentage below the cut-off, has not been widely used. There are a number of reasons for this. Most important, a short test cannot very well be tailored for two different difficulty levels, one for the percentage of failures in the criterion, the other for the cut-off in selection and classification. Both of these difficulty levels, in addition, were unpredictable on a long-term basis. The criterion, eliminations in training, fluctuated widely without much regard for the ability level of the entering student. Although the cut-off point in selection and classification was raised progressively, the changes were relatively rapid and were usually made without advance notice. Most classification tests, in addition, are weighted for more than one specialty, each of which may require a different appropriate degree of test difficulty. Thus the advantage accruing when all of the cases are categorized, as opposed to using highest 27 percent and lowest 27 percent criterion groups, was not needed. Otherwise, the selection of the procedure involving omission of 46 percent of the cases from the middle of the distribution as the standard computational technique would not necessarily have been most advantageous.

Item  Difficulty 

In computing phi coefficients in internal-consistency item analyses, percentages of examinees passing an item in the upper and lower criterion groups are obtained. The average of these two percentages gives an estimate of the difficulty level of the item. For various reasons these difficulty levels can be considered merely approximate as long as item counts are made only in the tails of the total-score distribution. When based on a percentage of correct responses in a total group, the difficulty level of the item is in part a function of its position in the test. When based on a percentage of attempts, difficulty level is independent, except for possible item interactions, of position, but is biased by reason of selection of a sample of those who attempt many items in the test. People who attempt many items are usually those who are most able in the test as a whole. Values computed in the first of these two ways are used in Kuder-Richardson internal-consistency coefficients, since it is the internal consistency, and therefore the reliability, of the test as administered in which one is interested. Data concerning difficulty level of items presented in the following chapters on tests, however, follow the second procedure. Statistics based upon "total group" furnish more information about the test as a whole; statistics based upon "total answered" furnish more information about the items as such.

Correction for chance success. — In addition to being based upon total answered, the item-difficulty data in the following chapters are corrected in the conventional manner⁷ for chance success. This procedure follows the usual reasoning to the effect that the expected proportion of chance success for a two-choice item is one-half, for a three-choice item one-third, etc., and that all examinees who do not know the correct answer guess at random. As a matter of fact, the amount of random guessing varies considerably from test to test, depending on the type of test and the care with which misleads are selected. Often, a given mislead is chosen by the examinee on the basis of misinformation, wrong hypothesis, or perceptual error. The greater the extent to which this is true, or in other words, the extent to which misleads and correct answer are not equally attractive to examinees who do not know the right answer, the larger is the amount of over-correction that results from the application of the formula based upon the random-guessing hypothesis.
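The correction itself is not printed in this section. The sketch below assumes it is the familiar rights-minus-a-fraction-of-wrongs formula, expressed as a proportion of those answering; the function name is hypothetical.

```python
def corrected_difficulty(proportion_correct, n_choices):
    """Proportion correct among those answering, corrected for chance on the
    assumption that everyone who does not know the answer guesses at random.
    Equivalent to (R - W / (k - 1)) / N for k choices and N = R + W."""
    if n_choices < 2:
        raise ValueError("an item needs at least two choices")
    chance = 1.0 / n_choices
    return (proportion_correct - chance) / (1.0 - chance)

# 60 percent correct on a five-choice item corrects to 50 percent.
print(corrected_difficulty(0.60, 5))
# An item on which a mislead is attractive can correct to a negative value.
print(corrected_difficulty(0.15, 5))
```

The second case shows the over-correction: when wrong answers come from misinformation and not from guessing, the subtraction removes more than chance success could have contributed.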

Evidences against the guessing hypothesis. — Various evidences are available concerning the inapplicability of the guessing hypothesis. These can only be briefly listed here. Difficulty levels for reliable items, for example, sometimes correct to zero or even to negative values. A test, secondly, can sometimes be made more internally consistent and reliable by an appropriate reduction in the number of misleads.⁸ Lastly, the corrected difficulty level of an item or a test does not always remain constant as the number and character of misleads is varied.⁹ These considerations lead to the conclusion that item difficulty is not very closely associated with number of misleads in some tests. When difficulty values are given for tests in later chapters, therefore, there is a bias in the direction of overestimation of difficulty rather than underestimation.

⁸ The mean phi in a variable 2-, 3-, or 4-choice 100-item test, Geography, AS104, was raised 0.023 (from 0.307 to 0.330) based on "total group" and 0.038 (from 0.27[illegible] to 0.300) based on "total answered," changes that were beyond the 5 percent and 1 percent levels of significance respectively, over the mean internal-consistency values in the otherwise identical 100-item 5-choice version, AS102. This was true even though the mean phi of the 212 eliminated misleads was [illegible] based on "total group" and [illegible] based on "total answered"; i.e., the average eliminated mislead had discriminated between high- and low-criterion groups in the expected direction.

⁹ The mean difficulty level of Mechanical Information, [code illegible], is 0.48 when corrected for chance success. The comparable value for [code illegible], an experimental two-choice version identical with the first form except for the omission of three misleads in every item, is 0.38. This difference is beyond the 1 percent level of significance.

VALIDITY

Validity data cited in this report are based on an extremely practical definition of the concept. The validity of a test is its relation to any variable one is interested in predicting. A test has potentially as many validities, therefore, as there are criteria available.

Validation Statistics

It was pointed out in the preceding chapter that the most common criteria of success as pilots, navigators, or bombardiers were pass-fail data. The prediction of graduation or elimination is the most useful datum to those concerned with training problems; information concerning graduation or elimination was also easiest to obtain of all criteria available.

The biserial correlation coefficient. — With test data for all practical purposes continuously distributed and a dichotomous criterion, the biserial correlation coefficient is immediately suggested. This has indeed been the standard validation statistic used in the test-construction program. Its chief advantage is its independence of split in the criterion so that test validities computed at one period of time can be compared with validities computed at a later date without regard to differences in elimination rates. The fact that the biserial correlation also gives an estimate of what the product-moment correlation would have been if the criterion had been continuously and normally distributed is perhaps satisfying, though whether this constitutes an advantage in prediction is debatable.
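As an illustration of the statistic, the usual textbook formula for the biserial correlation can be sketched as follows: the difference between the mean test scores of graduates and eliminees, divided by the standard deviation of all scores, times pq/y, where y is the ordinate of the unit normal curve at the point cutting off proportion p. The function name and the ten made-up cases are assumptions of the example.

```python
from statistics import NormalDist, fmean, pstdev

def biserial(scores, passed):
    """Biserial correlation between a continuous test score and a
    pass (1) / fail (0) criterion assumed to be normal underneath."""
    grads = [s for s, g in zip(scores, passed) if g == 1]
    elims = [s for s, g in zip(scores, passed) if g == 0]
    if not grads or not elims:
        raise ValueError("both graduates and eliminees are required")
    p = len(grads) / len(scores)
    q = 1.0 - p
    unit_normal = NormalDist()
    y = unit_normal.pdf(unit_normal.inv_cdf(p))  # ordinate at the point of split
    return (fmean(grads) - fmean(elims)) / pstdev(scores) * (p * q / y)

# Hypothetical scores for ten students, seven of whom graduated.
scores = [3, 4, 5, 5, 6, 6, 7, 7, 8, 9]
passed = [0, 1, 0, 1, 1, 0, 1, 1, 1, 1]
print(round(biserial(scores, passed), 3))
```

Because the result is an estimate resting on the normality assumption, it is not bounded by 1.00 in small or badly skewed samples, which is one reason to treat it with some care.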

Use  of  the  biserial  correlation  coefficient  is  subject  to  one  serious  draw¬ 
back.  Formulas  commonly  used  to  correct  for  restriction  of  range  are 
not  strictly  applicable  to  biserial  correlations.  The  greater  the  amount  of 
the  restriction  and  the  greater  the  disparateness  of  the  split  on  the  di¬ 
chotomy,  the  greater  is  the  error  involved.  Because  of  the  high  number 
of  disqualifications  for  low  aptitude  at  the  time  of  classification,  navi¬ 
gator  validities  were  somewhat  underestimated  almost  from  the  start; 
pilot  and  bombardier  validities  were  significantly  underestimated  only 
after  many  months  of  the  test-construction  program,  as  the  number 
of  low-aptitude  disqualifications  increased.  For  the  degree  of  restriction 
of range due to the elimination of the lower [illegible] percent or more of scores on the various stanines¹⁰ and for the elimination rates commonly
encountered  in  flying  training,  the  amount  of  error  in  the  corrected  bi¬ 
serial  correlation  may  amount  to  as  much  as  0.10. 

The point biserial. — An alternative statistic for use in the validation of a test against a dichotomous criterion is the point biserial. The applications of this statistic to psychological problems have not been sufficiently investigated to make definite recommendations. One disadvantage is immediately suggested: the point biserial is in part a function of the split in the dichotomous criterion. In order to be compared, test validities would have to be equated for differences in elimination rates. The fact, however, that the point biserial is a product-moment correlation, not an estimate of one, suggests that it might be useful.
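The dependence on the criterion split can be made concrete with the standard relation between the two coefficients, point biserial = biserial × y / √(pq), which holds under the normality assumption behind the biserial. The sketch below holds the biserial fixed at a made-up value of 0.40 and varies the graduation rate; the function name is hypothetical.

```python
import math
from statistics import NormalDist

def point_biserial_from_biserial(r_bis, graduation_rate):
    """Point biserial implied by a given biserial when proportion
    graduation_rate of the sample falls in the passing category."""
    p = graduation_rate
    q = 1.0 - p
    unit_normal = NormalDist()
    y = unit_normal.pdf(unit_normal.inv_cdf(p))
    return r_bis * y / math.sqrt(p * q)

for rate in (0.50, 0.70, 0.90):
    print(f"graduation rate {rate:.2f}: "
          f"point biserial = {point_biserial_from_biserial(0.40, rate):.3f}")
```

The same underlying relationship yields a smaller point biserial as the split departs from 50-50, so validities obtained under different elimination rates are not directly comparable on this statistic.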

Validation  Procedures 

Experimental tests were most often given to unclassified aviation students. Ideally, the aviation students should not be able to distinguish an experimental test from a classification test on which their qualification and classification depended. Thus, much validation-test administration was conducted along with classification testing. The first step in the validation procedure was, therefore, to obtain the classification records. If one were interested in pilot validation, the classified pilots were separated from navigators, bombardiers, etc. After the necessary interval of time, rosters of graduates and eliminees from elementary pilot training were searched¹¹ for men to whom the test was administered.

¹⁰ A stanine is a standard score, on a 9-step scale with a mean of 5.00 and a standard deviation of 2.00, which represents the composite aptitude score for a given type of flying training.

¹¹ Readers who are familiar with punch-card techniques can immediately translate this and other steps in the procedure to jobs of sorting, collating, tabulating, etc.

Selection of eliminee group. — Of the several types of eliminees listed by training schools, e.g., flying deficiency, academic deficiency, fear of flying, traits of character, and physical deficiency, nonpsychological categories constituting a small percentage of total eliminees were consistently omitted in the computation of the biserial correlations. In periods of high elimination rates, in addition, pilot validation was often accomplished on the basis of flying-deficiency eliminees only. This was done because flying deficiency was presumably the purest criterion available of pilot failure. Data are available, however, which show that stated type of elimination from pilot training is relatively unimportant as measured by the degree of relationship between the composite aptitude score (stanine) and the various categories of eliminations.

The correlations obtained. — In addition to computing the biserial correlation between the test and pass-fail in training, certain other correlations were routinely obtained as well. The biserial correlation between the appropriate stanine and the criterion was obtained for the same sample as a partial check on the representativeness of the sample. A product-moment correlation was obtained between the test and stanine in order to estimate the amount the test might add to the predictive efficiency of the classification battery. The latter two correlations are not presented routinely in this volume because a stanine does not constitute a fixed reference point, since the composition of stanines was changed with each new classification battery. Both correlations are used, however, in correcting a test validity for restriction of range. These corrected values are routinely presented for tests administered after the time that restriction of range became a serious problem.

Correction for restriction of range. — The standard formula¹² for correcting a test validity for restriction in range is derived from the formula for the partial correlation coefficient. Restriction in range of a variable (the stanine) has, as a limit, the reduction of the variable to a constant, with effects on related variables (test and criterion) predictable from the partial correlation technique. In a real sense, therefore, the partial correlation coefficient is a special case of restriction of range.

The effect of restriction of range on a test's validity depends upon its correlation with the stanine. It is not difficult to see why a test's validity should be increased when correcting for restriction of range when the test is highly correlated with the stanine. The correction will often change a small negative validity to a positive one if the correlation with stanine is substantial. Unless the relationship of the correction formula to the partial correlation coefficient is remembered, however, it may be surprising to find a test's validity lowered by applying the correction formula when the test is relatively uncorrelated with the stanine. A number of instances of this sort will be found in the chapters to follow.

¹² $$r'_{12} = \frac{r_{12} + r_{13}\, r_{23}\left(\frac{S_3^2}{s_3^2} - 1\right)}{\sqrt{\left[1 + r_{13}^2\left(\frac{S_3^2}{s_3^2} - 1\right)\right]\left[1 + r_{23}^2\left(\frac{S_3^2}{s_3^2} - 1\right)\right]}}$$ where $S_3$ is the unrestricted standard deviation of variable 3; $s_3$ is the restricted standard deviation on the sample, and the correlation corrected for restriction in range of variable 3 is $r'_{12}$. See report No. 1 of this series for a more complete evaluation and description of corrections for restriction of range.
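The behavior described in the last two paragraphs can be reproduced with a short sketch. It assumes the footnoted formula is the usual three-variable correction, with the test as variable 1, the criterion as variable 2, and the stanine as variable 3, the variable on which selection took place; the function name and all the correlations below are made up for illustration.

```python
import math

def correct_for_restriction(r12, r13, r23, sd_unrestricted, sd_restricted):
    """Validity r12 corrected for restriction in range on variable 3.
    r12, r13, r23 are the correlations observed in the restricted sample."""
    k = (sd_unrestricted / sd_restricted) ** 2 - 1.0
    numerator = r12 + r13 * r23 * k
    denominator = math.sqrt((1.0 + r13 ** 2 * k) * (1.0 + r23 ** 2 * k))
    return numerator / denominator

S, s = 2.00, 1.40  # unrestricted stanine SD and a hypothetical restricted SD

# Test highly correlated with the stanine: the validity is raised.
print(round(correct_for_restriction(0.20, 0.60, 0.30, S, s), 3))
# Test uncorrelated with the stanine: the validity is lowered.
print(round(correct_for_restriction(0.20, 0.00, 0.30, S, s), 3))
# A small negative validity becomes positive.
print(round(correct_for_restriction(-0.02, 0.50, 0.30, S, s), 3))
```

When the test is uncorrelated with the stanine the numerator is unchanged while the denominator grows, which is the source of the lowered validities mentioned above.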

Most of the corrected validities reported in this volume are based upon an assumed standard deviation of the unrestricted stanine of 2.00. This follows, of course, from the definition of stanine. During a period of several months, however, it was believed that in the process of setting up conversion tables of raw aggregate scores to stanine units, standard deviations of stanine significantly less than 2.00 had been obtained. Certain validities are accordingly reported as corrected to standard deviations of the unrestricted stanine of less than 2.00. The actual value used in making the correction is consistently footnoted.

A further source of confusion in making corrections for restriction of range is that at times data were available for the augmented pilot stanine only. The pilot stanine was augmented by adding either two or three stanine points to the aptitude score of students with specified amounts of previous flying experience.¹³ The proportion affected by this procedure varied from time to time, but was usually in the range from 0.10 to 0.15. The standard deviation of the augmented stanine was therefore greater than that for the unaugmented, the difference usually being about 0.10. The assumed standard deviation of the unrestricted augmented stanine was accordingly 2.10 during most of the period covered by this volume. The exceptions noted above for the unaugmented stanine, however, also apply to the augmented.

Item  Validation  Procedures 

Item validation was used primarily in the selection of items and for keying responses in personality inventories, biographical data blanks, etc. Ease of computation is always a criterion in any work with items, but beyond that is the need for a statistic that will produce valid empirical keys.

The tetrachoric correlation coefficient. — Since item validation necessarily involves a two-by-two contingency table when the criterion is dichotomous, the tetrachoric correlation is suggested as a suitable statistic. It was in fact used in a number of item-validation studies. Its chief advantage is that the degree of correlation is independent of varying elimination rates or item difficulties. Its standard error, on the other hand, is a function of both these variables. For sampling reasons, high correlations tend to occur predominantly on very easy or very difficult items. Unless some criterion of statistical significance is used in addition to the correlation, item selection will be biased away from items of moderate difficulty level towards very easy or very difficult items. Such items are less likely, because of sampling errors, to give comparable results on a second administration and do not afford maximum discrimination among examinees. For these reasons the tetrachoric correlation was not used as widely as the statistic to be described.

The phi coefficient. — The phi coefficient can be computed from the same contingency table as the tetrachoric correlation. It has quite different properties, however, since its standard error is independent of split in either dichotomy, while the size of the correlation is a function of split. Assuming a constant level of item intercorrelations, the mean phi coefficient between test items and the criterion can be directly related to the point biserial correlation between total test score and the criterion.¹⁴

For  an  item  that  discriminates  positively,  phi  is  at  a  maximum  when 
the number marking a given alternative equals the number in the superior criterion group (graduates). For maximum negative discrimination,
however,  phi  is  at  a  maximum  when  the  number  marking  a  given  al¬ 
ternative  equals  the  number  in  the  inferior  criterion  group  (eliminees). 
If  this  statistic  were  used  unmodified,  items  selected  for  keying  at  one 
level  of  the  graduation  rate  would  not  be  the  best  items  to  use  if  this 
rate  were  to  change  radically. 

The  computation  of  phi  coefficients  was  slightly  modified  in  practice 
as  a  means  of  partially  overcoming  the  dependence  of  the  statistic  on  a 
given  graduation  rate.  All  item-validation  statistics  were  computed  on  the 
assumption of equal weighting of graduate and eliminee groups; i. e., the
same  charts  were  used  for  item  validation  as  for  internal-consistency 
computations,  where  the  upper  and  lower  criterion  groups  were  always 
of  equal  size.  This  procedure  has  the  effect  of  increasing  all  item  phi 
coefficients,  but  increasing  most  those  for  splits  furthest  removed  from 
the  one  where  phi  would  be  at  a  maximum  in  the  more  precise  procedure. 
It  follows,  therefore,  that  the  application  of  a  standard-error  formula 
is  an  approximate  procedure,  although  a  single  standard  error  can  be 
used  more  precisely  for  phi  computed  in  this  way  than  for  the  tetra- 
choric  correlation. 
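
One way to read the equal-weighting modification is sketched below. It assumes that each criterion group is reduced to a proportion and that the two groups are then treated as if they were of equal size; the function name and arguments are hypothetical, and the computing charts themselves are not reproduced here.

```python
import math

def phi_equal_groups(grad_marking, grad_total, elim_marking, elim_total):
    """Phi computed as if graduates and eliminees were groups of equal size."""
    p_g = grad_marking / grad_total   # proportion of graduates marking the alternative
    p_e = elim_marking / elim_total   # proportion of eliminees marking the alternative
    spread = (p_g + p_e) * (2.0 - p_g - p_e)
    if spread == 0:
        return 0.0
    return (p_g - p_e) / math.sqrt(spread)

# 50 of 70 graduates and 0 of 30 eliminees mark the alternative: about 0.75,
# against about 0.65 when phi is computed from the actual 70-30 split.
equal_weight = phi_equal_groups(50, 70, 0, 30)
# The value depends only on the two proportions, not on the graduation rate.
other_rate = phi_equal_groups(50, 70, 0, 300)
```

Under this reading the statistic no longer changes when the graduation rate changes, at the cost of departing from the phi that the actual split would give.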

No  perfectly  satisfactory  item  validation  statistic  was  used.  If  the 
graduation  rate  from  training  were  more  constant,  the  phi  coefficient 
computed  from  the  actual  split  in  the  criterion  would  be  less  open  to 
criticism.  With  changing  graduation  rates  the  “compromise”  procedure 
may  be  more  generally  useful. 

Cross  validation. — The  development  of  a  key  for  a  personality  or  bio- 
graphical-data  inventory  can  be  very  briefly  summarized.  The  first  step 
involves  the  experimental  administration  of  the  inventory.  This  was 
done either during the classification period, while students were in
preflight training, or at the time of graduation from preflight training. The
accumulation of 2,000 cases of classified students would have been
desirable, although this was not always possible. When graduation-elimination
information became available, the unscored answer sheets were separated
into those for graduates and those for eliminees. The answer sheets of
these graduate and eliminee groups were then divided into odds and
evens, usually on the basis of odd and even testing numbers. Item
counts were obtained, as a next step, and item validities were computed
for odds and evens separately. Using responses exhibiting a difference
at  or  beyond  approximately  the  5  percent  level  of  significance,  and 
avoiding  the  alternatives  selected  by  a  very  small  (usually  less  than  0.15) 
or  very  large  proportion  of  the  cases,  separate  scoring  keys  were  made 
up  for  the  two  samples.  The  odds  answer  sheets  were  then  scored  with 
the  evens  key  and  vice-versa.  Validation  statistics  were  then  computed 
in  the  way  already  outlined. 
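
The steps just described can be sketched in a few lines. Everything named here is an assumption made for illustration: the record layout, the function names, the upper limit of 0.85 on the proportion marking an alternative, and the use of |phi| times the square root of N against 1.96 as a stand-in for the 5 percent level. None of these details is given in the account above.

```python
import math

def build_key(sheets, min_prop=0.15, max_prop=0.85, z_crit=1.96):
    """Return {(item, alternative): +1 or -1} for responses separating the groups.

    sheets: list of (graduated, answers); answers lists the alternative chosen
    on each item.
    """
    if not sheets:
        return {}
    n = len(sheets)
    grads = [ans for graduated, ans in sheets if graduated]
    elims = [ans for graduated, ans in sheets if not graduated]
    key = {}
    for item in range(len(sheets[0][1])):
        for alt in {ans[item] for _, ans in sheets}:
            a = sum(1 for ans in grads if ans[item] == alt)
            c = sum(1 for ans in elims if ans[item] == alt)
            b, d = len(grads) - a, len(elims) - c
            # Avoid alternatives chosen by very few or very many cases.
            if not (min_prop <= (a + c) / n <= max_prop):
                continue
            denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
            if denom == 0:
                continue
            phi = (a * d - b * c) / denom
            if abs(phi) * math.sqrt(n) >= z_crit:
                key[(item, alt)] = 1 if phi > 0 else -1
    return key

def score(answers, key):
    """Sum the key weights of the alternatives marked on one answer sheet."""
    return sum(key.get((item, alt), 0) for item, alt in enumerate(answers))

def cross_validate(records):
    """records: list of (testing_number, graduated, answers)."""
    odds = [(g, a) for num, g, a in records if num % 2 == 1]
    evens = [(g, a) for num, g, a in records if num % 2 == 0]
    odds_key, evens_key = build_key(odds), build_key(evens)
    # Each half is scored with the key developed on the other half.
    odds_scores = [(g, score(a, evens_key)) for g, a in odds]
    evens_scores = [(g, score(a, odds_key)) for g, a in evens]
    return odds_key, evens_key, odds_scores, evens_scores
```

The scores returned for each half can then be validated against graduation in the usual manner, since no answer sheet has been scored on a key to which it contributed.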

The cross-validation procedure avoids the “bootstrapping” involved
in  scoring  answer  sheets  on  a  key  prepared  from  item-validation  statis¬ 
tics  computed  on  the  same  sample.  If  both  the  odds  and  evens  keys  are 
valid,  the  final  recommended  key  obtained  from  combining  the  two  ex¬ 
perimental  keys  will,  on  the  average,  be  more  valid. 

FACTOR  ANALYSIS 

The  previous  chapter  discussed  the  importance  attached  to  factor 
analysis  in  test  construction  research.  In  the  present  section  will  be  dis¬ 
cussed  some  of  the  statistical  and  computational  aspects  of  the  technique. 

In  this  connection  several  factor-analysis  schools  will  be  briefly  covered. 

Common  Assumptions 

Factor  analysts,  no  matter  how  much  they  may  differ  among  them¬ 
selves  on  certain  points,  make  common  assumptions  in  their  factor  solu¬ 
tions.  Any  given  distribution  of  test  scores,  for  example,  is  assumed  to 
result  from  a  weighted  additive  combination  of  orthogonal1*  reference 
factors.  The  correlation  between  any  two  variables,  therefore,  is  also  an 
additive combination. It can be written as follows:
$r_{12} = a_1 a_2 + b_1 b_2 + \cdots + k_1 k_2$,
where $a_1$ is the loading in factor $a$ of test 1, etc., and where $k$ is the
last factor in the analysis.
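
As a small numerical illustration of the additive assumption, the sketch below reproduces a correlation from two sets of loadings. The loadings are invented for the example and the function name is hypothetical.

```python
def reproduced_correlation(loadings_1, loadings_2):
    """Correlation implied by two tests' loadings on orthogonal factors."""
    return sum(x * y for x, y in zip(loadings_1, loadings_2))

# Invented loadings of two tests on three orthogonal factors.
test_1 = [0.6, 0.3, 0.0]
test_2 = [0.5, 0.0, 0.4]

# 0.6*0.5 + 0.3*0.0 + 0.0*0.4, i.e. about 0.30
r_12 = reproduced_correlation(test_1, test_2)
# A test's communality is the same sum taken with itself: 0.36 + 0.09 = 0.45
h2_test_1 = reproduced_correlation(test_1, test_1)
```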

Many  critics  of  factor  analysis  have  seized  upon  the  additive  assump¬ 
tion  as  a  possible  weak  link.  The  additive  assumption  does  not  allow  for 
complex  interactions  of  parts,  for  the  whole  being  unpredictable  from 
knowledge  of  the  parts,  or  for  parts  being  unrecognizable  in  the  whole. 
This  is  a  question,  however,  to  which  an  experimental  answer  is  possible. 
Evidence  is  presented  in  chapter  28  showing  to  what  extent  the  additive 
assumption has been found to correspond to test data.

Divergent Computational Procedures

Any  casual  statistical  reader  knows  the  centroid  method  of  factor 
analysis by name and associates it with Thurstone (9). Almost equally
well known are the principal-axes method of Hotelling (4), and the
principal-components method of Kelley (6). This same reader knows
that there are certain disagreements among these individuals concerning
methodology, but is usually quite uncertain concerning the actual points
of disagreement.

The  mathematical  solutions. — There  is  little  difference  of  opinion  con¬ 
cerning  the  mathematical  solutions.  Either  the  principal-axes  method  or 
the  principal-components  method  is  superior  to  the  centroid  solution 
mathematically,  but  inferior  from  the  standpoint  of  computational  labor 
involved.  The  mathematical  superiority  of  the  first  two  is  due  primarily 
to  the  fact  that  each  succeeding  factor  extracts  a  maximum  portion  of 
remaining  variance. 

It is often stated that the first two methods are not scientifically
parsimonious because, as commonly used, they extract as many factors
as there are variables in the correlational matrix. This criticism is not
justified.  If  all  methods  are  applied  to  the  same  correlational  matrix  with 
the  same  diagonal  entries,  there  is  no  difference  in  parsimony  in  favor  of 
the  centroid  method.  Use  of  1.00  in  the  diagonal  results  in  as  many  fac¬ 
tors  (not  necessarily  all  reliable)  as  variables.  If  the  communality  is  used 
as  the  diagonal  entry  instead,  no  more  factors  need  be  retained  by  one 
method  than  by  any  other.  If  the  factors  computed  by  various  methods 
are compared by ordinal number, the centroid factors will be found to
reduce  the  variance  in  the  matrix  less  sharply  than  the  other  two  methods. 

Diagonal  entries. — The  real  crux  of  the  differences  among  factor 
analysts  lies  in  the  selection  of  the  diagonal  entry  in  the  correlational 
matrix.  This  in  turn  is  directly  related  to  the  problem  of  whether  to 
rotate  axes. 

The  advocates  of  the  use  of  1.00  as  the  diagonal  entry  seem  to  value 
most  highly  the  mathematical  advantages  that  accrue  when  this  procedure 
is used. Being able to assess the reliability of a factor is indeed a
considerable advantage. The writer, among others, is unable, however, to find
many psychological advantages in this procedure.

If  1.00  is  used  as  the  diagonal  entry  in  a  correlation  matrix  composed 
of coefficients uncorrected for attenuation, the resulting factors and factor
loadings constitute an inextricable mixture of common factors, nonerror
specifics, and error specifics. These factors are probably not very
meaningful,  although  they  furnish  an  exact  mathematical  description  of 
the original correlations. Rotations are attempted only rarely on such data.
It  is  the  writer’s  guess,  however,  that  stable  positions  of  axes  cannot  be 
found  in  these  analyses. 

The  advocates  of  the  use  of  the  test’s  communality,  i.  e.,  the  sum  of 
the  squares  of  the  common-factor  loadings,  in  the  diagonal  forego  math¬ 
ematical  nicety  for  greater  psychological  meaning.  Rotations  can  be  made 
to positions of the factors that will reoccur in subsequent analyses, relatively
independently of the constitution of the particular matrices. Experience
has shown that these factors do have psychological meaning. For
example,  interpretation  of  stable  factors  will  enable  the  test  constructor, 
in  revising  tests,  to  increase  or  decrease  a  loading  in  a  given  factor  at 
will.  Although  few  data  are  available,  it  is  probable  that  psychological 
interpretation of a factor can be related to job-analysis information sufficiently
accurately to weight tests for a criterion in the absence of validation
data for these tests in connection with that criterion.

Procedures  Used  in  Army  Air  Forces  Test  Research 

Because  most  matrices  to  be  analyzed  were  relatively  large,  the  cen¬ 
troid  procedure  was  used  for  computational  convenience.  In  order  to  ob¬ 
tain  meaningful  common  factors,  estimates  of  the  communality  were  used 
as the diagonal entries. Communalities were estimated by selecting the
highest coefficient in a column.
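
The estimate of communality just mentioned can be sketched as follows. The function name is hypothetical, and taking the highest coefficient without regard to sign is an assumption; the text says only that the highest coefficient in the column was selected.

```python
def fill_diagonal_with_highest(r):
    """Copy of correlation matrix r with each diagonal cell replaced by the
    largest off-diagonal coefficient (in absolute value) in its column."""
    n = len(r)
    out = [row[:] for row in r]
    for j in range(n):
        out[j][j] = max(abs(r[i][j]) for i in range(n) if i != j)
    return out

# Invented three-test matrix with 1.00 in the diagonal.
r = [[1.0, 0.5, -0.7],
     [0.5, 1.0, 0.2],
     [-0.7, 0.2, 1.0]]
r_for_factoring = fill_diagonal_with_highest(r)   # diagonal becomes 0.7, 0.5, 0.7
```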

Centroid  computations. — Centroid  loadings  were  computed  in  the  cus¬ 
tomary  manner  with  one  exception.  The  criterion  used  in  reflecting  signs 
was  the  algebraic  sum  of  a  column  disregarding  the  diagonal  rather  than 
the  mere  number  of  negative  signs  in  the  column.  The  newer  procedure 
insures  positive  sums  and  undoubtedly  comes  closer  to  maximizing  table 
totals  than  the  earlier  one. 
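
A minimal sketch of one centroid extraction with this reflection rule is given below. It is not a transcription of the worksheets used in the program: the function name is hypothetical, the matrix is assumed to be symmetric with positive communality estimates already in the diagonal, and reflection is repeated on the column with the most negative sum until no negative sum remains.

```python
import math

def centroid_factor(r):
    """Extract one centroid factor; return (loadings, residual matrix)."""
    n = len(r)
    signs = [1] * n
    m = [row[:] for row in r]
    while True:
        # Column sums disregarding the diagonal, as in the reflection criterion.
        sums = [sum(m[i][j] for i in range(n) if i != j) for j in range(n)]
        worst = min(range(n), key=lambda j: sums[j])
        if sums[worst] >= -1e-12:
            break
        # Reflect the variable: change the sign of its row and column.
        signs[worst] = -signs[worst]
        for i in range(n):
            if i != worst:
                m[i][worst] = -m[i][worst]
                m[worst][i] = -m[worst][i]
    col_totals = [sum(m[i][j] for i in range(n)) for j in range(n)]
    grand_total = sum(col_totals)
    # Loadings are column totals over the root of the table total,
    # with the signs of reflected variables restored.
    loadings = [signs[j] * col_totals[j] / math.sqrt(grand_total)
                for j in range(n)]
    residual = [[r[i][j] - loadings[i] * loadings[j] for j in range(n)]
                for i in range(n)]
    return loadings, residual

# Invented matrix generated by a single factor with loadings 0.8, 0.6, -0.5.
r = [[0.64, 0.48, -0.40],
     [0.48, 0.36, -0.30],
     [-0.40, -0.30, 0.25]]
loadings, residual = centroid_factor(r)
```

In this constructed case the third test is reflected, the original loadings are recovered, and the residual matrix vanishes. The function would be applied again to the residuals to obtain each further factor.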

Every  user  of  the  centroid  method  finds  himself  perplexed  by  the 
problem  of  the  number  of  factors  to  extract.  No  single  criterion  is  suffi¬ 
cient.  Most  of  the  criteria  that  have  been  suggested  do  not  allow  the  ex¬ 
traction  of  a  sufficient  number  of  factors  to  obtain  stability  of  factor 
patterns  in  rotations.  An  objective  criterion  that  has  been  found  to  be 
useful  is  the  comparison  of  the  standard  error  of  a  zero  correlation  with 
the  product  of  the  two  highest  centroid  loadings.  Factoring  should  not  be 
stopped  until  the  latter  is  at  least  as  small  as  the  former.  The  criterion 
that was actually given final weight, however, was quite subjective.
Interpretability of the results is the only possible basis at present for choosing
between, for example, 9 or 10 factors. In most cases the objective differences
between two successive centroid factors are too slight, and the
change  too  smooth,  to  make  a  confident  decision  concerning  the  exact 
number  of  factors. 
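
The objective criterion described in this paragraph can be stated compactly. The sketch assumes the large-sample form 1/sqrt(N - 1) for the standard error of a zero correlation, which the text does not specify, and the function name is hypothetical.

```python
import math

def keep_factoring(loadings, n_cases):
    """True while the product of the two highest loadings (in absolute value)
    still exceeds the standard error of a zero correlation."""
    top_two = sorted((abs(x) for x in loadings), reverse=True)[:2]
    product = top_two[0] * top_two[1]
    se_zero_r = 1.0 / math.sqrt(n_cases - 1)   # assumed form of the standard error
    return product > se_zero_r

# With 1,001 cases the standard error is about 0.032.
go_on = keep_factoring([0.30, 0.25, 0.10], 1001)     # 0.075 exceeds it: True
may_stop = keep_factoring([0.15, 0.12, 0.05], 1001)  # 0.018 does not: False
```

As the paragraph makes clear, a rule of this kind served only as a lower bound; the final choice rested on interpretability.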

Rotations. — Axes  were  always  rotated  in  pairs.  This  was  accomplished 
in various ways. At first the factors were plotted in order to estimate the
angle  of  rotation,  and  rotated  loadings  were  obtained  by  calculating  ma¬ 
chine  using  the  trigonometric  functions  of  the  angle.  With  more  experi¬ 
ence  the  angle  of  rotation  was  estimated  from  the  numerical  values  alone, 
and  the  rotated  loadings  were  obtained  as  before. 
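
The numerical step in these first two procedures, obtaining rotated loadings from the trigonometric functions of the angle, amounts to the following. The function name and the sign convention (axes turned counterclockwise through the stated angle) are assumptions of this sketch.

```python
import math

def rotate_pair(loadings_a, loadings_b, angle_degrees):
    """Orthogonal rotation of one pair of factor axes through a given angle."""
    theta = math.radians(angle_degrees)
    c, s = math.cos(theta), math.sin(theta)
    new_a = [a * c + b * s for a, b in zip(loadings_a, loadings_b)]
    new_b = [-a * s + b * c for a, b in zip(loadings_a, loadings_b)]
    return new_a, new_b

# Invented loadings of two tests on factors A and B, rotated through 45 degrees.
new_a, new_b = rotate_pair([0.6, 0.0], [0.6, 0.5], 45)
# The first test now lies on the new axis A: about 0.85 on A and 0.00 on B.
```

Because the rotation is orthogonal, the sum of squared loadings of each test on the two factors is unchanged, which offers a ready check on the arithmetic and on the signs.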

The  original  procedure  was  time  consuming;  the  second  procedure  in¬ 
volved  a  difficult  visualizing  process.  With  both,  computers  had  difficulty 
with  signs.  As  a  result  a  new  procedure  was  devised  *•  that  minimized 
most of the difficulties encountered with the previous ones. A pair of factors
is plotted by projection, utilizing T-square and triangle, from the
original plot of each onto a third sheet of paper. Rotations are made
directly  on  the  plot,  and  the  new  axes  are  used  in  succeeding  rotations. 
The  entire  process  is  geometric.  Numerical  values  of  the  various  factor 
loadings  are  not  involved  from  the  time  the  first  plots  are  made  until  the 
final rotated loadings are read off the sheets. Rotations are made more
accurately  and  more  rapidly  than  by  any  other  method  tried. 

All the rotational solutions presented in this volume are orthogonal.
Nonorthogonal  solutions  were  not  attempted.  It  seemed,  in  the  first  place, 
that  those  who  use  nonorthogonal  solutions  place  too  much  confidence  in 
essentially  negative  results.  How  can  one  be  sure  that  the  next  group  of 
tests  will  not  change  the  correlation  between  two  factors  from  +0.15  to 
0.00?  As  a  matter  of  fact,  most  of  the  intercorrelations  reported  among 
factors  are  so  low  that  it  hardly  seems  worth  while  to  depart  from  orthog¬ 
onality,  even  if  one  could  be  sure  that  the  correlations  were  the  true  ones. 

CONCLUSIONS 

This  chapter  discusses  many  statistical  procedures  common  to  the  test 
analyses  and  descriptions  to  be  presented  in  succeeding  chapters.  On  the 
basis  of  rather  extensive  experience  with  certain  techniques,  evaluations, 
and recommendations were also made in several instances. The topics discussed
were: reliability; internal consistency, both of the test as a whole
and of the items composing it; validity, again both of the test and its
items; and factor analysis. A more complete discussion of research techniques
will be found in Report No. 3 of this series.

TABULAR  SYMBOLS 

For  the  convenience  of  the  reader,  a  list  of  tabular  symbols  commonly 
used  in  this  volume  is  appended  below: 

$N_t$ = Total number of cases in a sample.

$p_g$ = Proportion of total sample graduated from the indicated phase of training.

$M_g$ = Mean score of graduates.

$M_e$ = Mean score of eliminees.

$SD_t$ = Standard deviation of score distribution for the complete sample, including graduates and eliminees.

$r_{bis}$ = Biserial correlation coefficient between test scores and the criterion, uncorrected for restriction in range on the stanine.

${}_c r_{bis}$ = $r_{bis}$ corrected for restriction of range on the stanine.

$r'_{11}$ = Product-moment correlation between scores on separate comparable halves, separate comparable forms, or odd and even groups of questions, of a test.

$r_{11}$ = $r'_{11}$ corrected for length by the Spearman-Brown formula.

$M_\phi$ = Mean phi coefficient.

$SD_\phi$ = Standard deviation of the distribution of phi coefficients.

$p$ = Uncorrected proportion of individuals passing an item.

${}_c p$ = $p$ corrected for chance success.
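
Two of the corrections named in this list can be written out briefly. The Spearman-Brown formula is standard. The correction for chance success shown here is the conventional one for items with k alternatives, and it is an assumption that this was the form used in the tables. The function names are hypothetical.

```python
def spearman_brown(r_part, length_factor=2.0):
    """Reliability of a test lengthened by length_factor, from the part correlation."""
    return length_factor * r_part / (1.0 + (length_factor - 1.0) * r_part)

def corrected_for_chance(p, n_alternatives):
    """Proportion passing, less the share of passes expected from guessing."""
    return p - (1.0 - p) / (n_alternatives - 1)

# A correlation of 0.60 between halves corresponds to 0.75 for the full test.
full_length = spearman_brown(0.60)
# 60 percent passing a five-choice item corresponds to 50 percent after correction.
p_corrected = corrected_for_chance(0.60, 5)
```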

The  reader  will  be  able  to  interpret  minor  variants  of  those  symbols  as 
they  occur  in  this  volume. 

One liberty is taken in tables. For convenience, product-moment correlations
are frequently entered in a column for the biserial coefficient. When this occurs,
appropriate footnotes are always made.

CHAPTER FOUR


Tests  of  Intellect  and  Information1 


THREE  GENERAL  TEST  CATEGORIES 


The  presentation  of  research  with  printed  tests  is  divided  into  three 
sections:  intellect  and  information,  perception,  and  temperament.  This 
follows  the  original  division  of  research  responsibilities  among  Psycho¬ 
logical  Research  Unit  No.  3;  Psychological  Section,  Headquarters,  AAF 
Training  Command;  and  Psychological  Research  Unit  No.  1,  respec¬ 
tively.*  The  coding  system  used  for  printed  tests  parallels  this  division  of 
research  responsibilities,  and,  therefore,  the  organization  of  the  volume.* 
While  the  relationship  between  coding  system  and  volume  organization  is 
not perfect, most of the tests in the CI series will be found in chapters 5
through 14, those in the CP series in chapters 16 through 21, and those
in the CE series in chapters 23 through 27. The existence of exceptions to
this relationship is indicative of the absence of sharp lines of demarcation
between the categories, e. g., certain tests that were once given a perceptual
or a temperament code number were considered to be more similar
to intellectual tests when the volume was being organized.

Although  other  ways  of  categorizing  tests  might  have  been  devised,  the 
system  used  has  the  advantage  of  meaning  much  the  same  thing  to  most 
psychologists.  In  spite  of  some  degree  of  overlapping  of  categories,  there 
will  be  relatively  few  disagreements  about  the  placement  of  most  tests, 
although psychologists of different backgrounds may describe the categories
quite differently. The following statements, in brief, constitute one
such description. The intellectual category can be distinguished from the
perceptual by the use of symbolization in the statement of the questions
and misleads. Intellectual tests require symbolic mediation, usually by
verbal or numerical symbols; perceptual tests do not. Both differ from
temperament tests in that the latter stress manner or way of behaving,
while the other two stress amount of knowledge or ability. The correct
answer to an item in an intellectual or perceptual test follows inflexibly
from a set of rules, while there is no right answer in this sense to an item
in a temperament test.

HISTORICAL REASONS FOR THE USE OF TESTS OF
INTELLECT AND INFORMATION

Academic  Requirements  for  Flying  Training 

Beginning in 1920, high-school graduation or its equivalent was made
a  requirement  for  entrance  into  Army  aviation  training.  Aviation  flying 
training,  at  that  time,  consisted  of  training  pilots  only,  although  pilots 
studied  navigation  as  well,  frequently  subsequent  to  graduation  from  fly¬ 
ing  school.  In  order  to  determine  the  qualifications  of  candidates  who  had 
not  graduated  from  high  school,  administration  of  examinations  was 
authorized.  These  examinations  were  expected  to  cover  pertinent  high- 
school  subjects.  By  1925,  two  such  examinations  were  scheduled  each  year. 

In  1927,  the  educational  requirement  was  increased  to  2  years  of  col¬ 
lege  or  its  equivalent,  owing  to  the  increase  in  the  number  of  applicants 
for  aviation  training.  Candidates  who  did  not  have  2  years  of  college 
training  were  given  a  special  essay-type  examination  covering  nine  college 
subjects.  This  procedure  was  followed  until  the  imminence  of  our  in¬ 
volvement  in  war  demanded  a  more  extensive  selection  and  classification 
program. 

The  Substitution  of  Objective  Examinations 

The  need  for  a  more  objective  and  standardized  qualification  instru¬ 
ment  led  to  a  request  by  the  Air  Corps  that  the  Personnel  Procedures 
section of the Adjutant General’s office construct an objective-type educational
examination to be used for air-crew selection. This examination,
completed  late  in  1941,  consisted  of  five  required  sections,  four  of  which 
were  mathematical  and  the  fifth,  English  composition.  Five  additional 
subjects — general  history,  United  States  history,  physics,  chemistry,  and 
a  language — were  listed  from  which  two  options  could  be  chosen. 

After  being  in  use  for  only  2  months,  the  objective  educational  exami¬ 
nation  was  supplanted  in  January  1942  by  the  Aviation  Cadet  Qualifying 
Examination.  A  month  later  the  testing  program  was  extended  to  include 
a  battery  of  classification  tests  which  were  administered  to  all  who  had 
qualified  for  air-crew  training. 

A  natural  consequence  of  the  substitution  of  objective  testing,  in  place 
of  the  prerequisite  of  2  years  of  college,  was  the  suggestion  that  tests  of 
intellect and information be included in the classification battery. It was
not  clear  at  the  outset  whether  2  years  in  college  or  its  equivalent  was 
predictive  of  success  as  a  pilot,  bombardier,  or  navigator,  though  the 
case for the latter at least was fairly clear-cut. Studies of these relationships,
however, constituted an obvious first step.

JOB  ANALYSIS  FINDINGS  IN  RELATION  TO  TESTS  OF 
INTELLECT  AND  INFORMATION 

The  Pilot 

The analysis of faculty-board proceedings, mentioned in chapter 1, resulted
in the establishment of four categories in the area of intellect and
information.  These  categories  were  generally  defined  in  terms  of  expres¬ 
sions  used  by  instructors  in  describing  reasons  for  elimination. 

Judgment. — Ability  to  make  sound  judgments  and  choices  as  to  the 
best  thing  to  do  when  faced  with  practical  problems  in  traffic,  in  making 
forced  landings,  and  in  similar  situations. 

Foresight  and  planning. — Ability  to  plan  a  series  or  sequence  of 
maneuvers,  plan  ahead  for  landings,  plan  entry  into  or  exit  from  traffic, 
and  foresee  and  avoid  possible  difficulties. 

Memory. — Ability  to  remember  instructions  from  day  to  day,  both  gen¬ 
eral  explanations  and  specific,  detailed  information. 

Comprehension. — Ability  to  grasp  the  meaning  of  explanations,  in¬ 
structions,  and  demonstrations  when  they  are  given  either  orally  or  in 
written  form. 

In  spite  of  the  fact  that  comments  appeared  in  one  or  more  of  these 
categories  in  reports  of  68%  of  all  eliminations  from  pilot  training,  it 
was realized at the outset that the importance and psychological uniqueness
of these categories remained to be established. Relatively little dependence
can be placed upon descriptions of psychological traits made by
untrained  personnel,  particularly  when  a  limited  and  informally  stand¬ 
ardized  vocabulary  is  used  to  characterize  failures. 

In  addition  to  the  categories  yielded  by  the  analysis  of  faculty-board 
proceedings,  early  job-analysis  information  indicated  rather  strongly  the 
importance  of  mechanical  information  and  mechanical  comprehension  for 
the  pilot.  The  mere  fact  that  the  airplane,  of  which  the  pilot  has  charge, 
is  an  extremely  complicated  mechanism  was  sufficient  basis  for  starting 
research  to  determine  the  pilot  validity  of  mechanical  tests. 

The  Navigator 

In  navigator  training,  the  one  thing  that  stood  out  in  even  the  most 
cursory  job  descriptions  was  the  importance  of  numerical  and  mathemati¬ 
cal  skills.  In  general,  the  task  of  the  navigator  seemed  to  call  for  the  same 
traits that are necessary for success in academic pursuits. While this conclusion
has been somewhat modified by subsequent experience, it is still
true  that  the  navigator  is  the  most  academic  member  of  the  air  crew.  Cer¬ 
tainly,  tests  of  intellect  and  information  were  high  on  the  priority  list  for 
research  on  the  problems  of  navigator  qualifications. 

The  Bombardier 

Early descriptions of the job of the bombardier were so meager and
so  conflicting  that  relatively  little  basis  was  furnished  for  test  construc¬ 
tion.  There  was  some  consensus  that  mechanical  tests  might  be  valuable 
in  the  selection  of  candidates  for  bombardier  training.  Since  later  studies 
showed the bombardier criterion to have little reliability, conflicting reports
concerning the qualities of good and poor bombardiers were to be
expected.

THE CODE NUMBER SYSTEM

Based on the available job-analysis information, a code-number system
was established for tests in this area. Successive hundreds in the CI series
are  assigned  as  follows: 

100  Information. 

200  Reasoning. 

300  Judgment. 

400  Foresight  and  Planning. 

500  Memory. 

600  Comprehension. 

700  Mathematics. 

800  Physics. 

900  Mechanical  Comprehension. 

Mathematics  and  physics  are  clearly  not  coordinate  with  the  other  cate¬ 
gories.  They  were  listed  separately  because  of  the  expected  volume  of 
tests under those headings. Since later test construction was closely
geared  to  validation  studies,  the  expected  volume  of  tests  in  certain  areas, 
e.  g.,  physics,  did  not  materialize.  There  are,  also,  relatively  few  informa¬ 
tion  tests  as  such.  Information  tests  have  been  designed  largely  as  inter¬ 
est  tests,  and  so  their  assigned  code  numbers  have  been  in  the  tempera¬ 
ment  (CE)  area. 

CHAPTER  ORGANIZATION 

The  organization  of  the  chapters  in  this  section  follows  the  coding 
system  with  relatively  few  exceptions.  The  closest  correspondence  be¬ 
tween  chapter  content  and  coding  system  is  for  the  following:  Chapter 
5, Verbal Ability Tests, all in the comprehension area; chapter 6, Mathematics
Tests; chapter 7, Reasoning Tests; chapter 9, Foresight and
Planning Tests; and chapter 11, Memory Tests.

Other  chapters  correspond  very  closely  to  the  logical  framework  of  the 
coding  system,  but  exhibit  minor  irregularities.  Chapter  8,  Judgment 
Tests, includes, besides judgment tests per se, tests of estimation and of
fluency.  Construction  of  these  other  types  of  tests  was  based  on  hypothe¬ 
ses  concerning  the  unique  components  in  the  factorially  complex  act 
called judgment. Chapter 13, Mechanical Tests, includes a discussion of
physics tests. The physics tests were too few and too lacking in importance
in the test construction program to warrant a separate chapter. There
is,  in  addition,  an  obvious  relationship  of  physics  tests  to  the  mechanical 
area. 

Chapter 14, Information Tests, is irregular in that it includes tests that,
as  presumptive  measures  of  interests,  were  given  temperament  code  num¬ 
bers.  The  decision  to  include  these  tests  in  this  section  rather  than  later 
was  somewhat  arbitrary.  To  have  done  otherwise  would  have  divided 
information  tests  between  two  chapters.  The  factorial  content  of  these 
tests is prevailingly intellectual rather than temperamental, in spite of the
test constructors’ intentions. The provision for an information category
in the CI area is another argument for including the information tests in
this  section. 

Chapter 10, Integration Tests, constitutes one of the two major exceptions
to the relationship between coding system and chapter organization.
The  integration  concept  arose  from  two  different  sources.  The  ability  to 
integrate  the  influence  of  several  simultaneously  operative  elements  in  a 
situation,  all  of  which  bear  upon  the  choice  of  a  single  direction  of  action, 
seemed  an  apt  description  for  a  valid  pilot  factor  discovered  in  one  of  the 
early  factor  analyses.  Later  job  descriptions  also  furnished  additional  evi¬ 
dence  of  the  importance  in  pilot  training  of  this  hypothesized  ability.  Of 
the available categories, comprehension seemed most like this concept. The
integration  tests  were  therefore  given  code  numbers  in  the  latter  half  of 
the  comprehension  series. 

The  second  exception  is  Chapter  12,  Visualization  Tests.  This  chapter 
is  close  to  the  borderline  between  this  section  and  the  one  on  perception. 
This  is  readily  apparent  from  the  variety  of  code  numbers  included  in  the 
chapter.  Historically,  the  first  visualization  test  in  the  program  was  con¬ 
structed  as  one  of  a  battery  of  mechanical-comprehension  tests.  The  vis¬ 
ualization  ability,  as  a  matter  of  fact,  seems  to  be  an  important  component 
of  many  seemingly  mechanical  tests.  In  a  later  battery  of  reasoning  tests, 
the visualization factor again was found to be prominent. In this battery
it  was  also  discovered  that  a  good  visualization  test  could  be  presented  en¬ 
tirely  in  verbal  terms.  Psychologically  considered,  visualizing  is  symbolic 
activity  rather  than  direct  response  to  sensory  stimulation.  These  various 
evidences  seemed  to  provide  sufficient  justification  for  including  the  vis¬ 
ualization  chapter  in  the  section  dealing  with  tests  of  intellect  and 
information. 

CONCLUDING  STATEMENT 

The  informed  reader  will  recognize  many  standard  tests  in  the  chapters 
to  follow.  Assignment  of  credit  for  construction  of  these  tests  is  based 
upon  the  work  necessary  to  adapt  the  tests  to  an  aviation-student  popula¬ 
tion  or  to  an  IBM  answer  sheet.  The  latter  task,  particularly,  often  re¬ 
quires  no  small  degree  of  ingenuity.  Because  of  the  great  amount  of  “till¬ 
ing”  the  field  of  intellectual  and  information  tests  has  had  since  the  first 
Binet-Simon  test  scale  was  published,  it  has  been  difficult  to  make  truly 
original  contributions  in  test  construction  per  se.  The  original  types  of 
tests, and there are several, are all the more gratifying for this reason.

These  chapters  contain,  in  general,  more  contributions  to  our  knowledge 
about tests and the abilities they measure than they do descriptions of
original  tests.  It  is  believed  that  the  reader  will  be  impressed,  as  the  writer 
has been, with the need for a redefinition and reanalysis of general intelligence
and a reconstitution of the instruments that purport to measure it.
Most  of  the  variance  of  standard  tests  of  intelligence  could  undoubtedly 
be accounted for by appropriately selected and weighted tests taken from
the three chapters on Verbal Abilities, chapter 5; Mathematics, chapter 6;
and Reasoning, chapter 7. Yet, such a combination would represent at
least three relatively independent abilities in the aviation-student population,
and any given intelligence-test score would therefore represent only
one  of  several  possible  combinations  of  ability  levels  in  the  three  traits. 
If  differential  validities  of  these  traits  exist  for  various  job  criteria,  it  is 
obvious  that  needless  errors  in  prediction  are  made  by  using  tests  of  gen¬ 
eral  intelligence.  And,  although  the  user  of  such  tests  may  thoroughly 
understand  that  no  claims  can  be  made  for  a  complete  coverage  of  human 
traits,  he  is  likely  to  neglect  many  important  abilities  because  of  the  social 
importance  at  present  attached  to  intelligence.  A  more  analytical  approach 
would  make  such  neglect  virtually  impossible. 
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CHAPTER  FIVE 


Verbal  Ability  Tests1 


INTRODUCTION 
Army Aviation Selection in Prewar Years

The  development  of  tests  of  verbal  ability  was  a  natural  outgrowth  of 
studies  carried  on  before  the  emergence  of  the  aviation-classification  pro¬ 
gram.  As  was  stated  in  chapter  4,  beginning  in  1927,  2  years  of  college 
training,  or  the  equivalent,  were  required  for  acceptance  of  applicants  for 
flying  training.  Those  who  had  not  satisfied  the  college  requirement  could 
qualify  by  passing  a  special  examination  on  nine  college  subjects.  This 
early  emphasis  on  scholastic  achievement  as  a  criterion  for  selection 
proved  to  be  a  forerunner  of  the  Reading  Comprehension  and  Vocabulary 
tests  which  were  later  employed  in  air-crew  classification. 

The  AAF  Qualifying  Examination 

The  first  two  parts  of  the  initial  form  of  the  Qualifying  Examination 
are  Vocabulary  and  Reading  Comprehension.  The  following  reasons  were 
given  for  the  inclusion  of  the  vocabulary  section : 

The purpose of the vocabulary section is to make possible the selection of men
who have good general intelligence and are able to comprehend and understand
written directions. Vocabulary tests have been found to predict the ability to
understand and remember the sort of material that is covered in air-crew ground
schools, where the student must remember what he reads and hears (2).

The  following  reasons  were  given  for  the  inclusion  of  the  reading 
comprehension  section: 

The  purpose  of  this  section  is  to  select  individuals  who  can  read  and  comprehend 
the  sort  of  material  that  they  must  study  and  apply  in  aviation  training.  This  sec¬ 
tion,  like  the  vocabulary  section,  is  a  measure  of  general  and  intellectual  ability  (2). 

Statistical results. — Statistical results soon revealed that the different
parts of the test were of varying importance for predicting success in
pilot, bombardier, and navigator training. The vocabulary section was of
special value only for the prediction of success in navigation school (see
table 5.1).

The reading-comprehension section showed a positive correlation with
success in pilot training and a very high correlation with success in
navigation training. In addition, it was the most effective part of the
examination for predicting success in bombardier training (see table 5.1).

•  Wftttrt  Wy  T/Sft.  S «nf«rd /.  M*tk. 
Table 5.1. — Validation data for Vocabulary and Reading Comprehension tests
for air-crew training

Group          Test                    N_t   p_g    M_g     M_e     SD_t   r_bis

Pilots¹        Vocabulary              5??   0.5?   2?.90   29.??   7.16   -0.04
Pilots¹        Reading comprehension   545    .?9   12.49   12.03   2.02     .14
Navigators²    Vocabulary              221    .79   31.87   27.85   7.03     .32
Navigators²    Reading comprehension   221    .79   13.05   11.13   2.15     .52
Bombardiers²   Vocabulary              191    (³)   29.56   27.39   6.09     .18
Bombardiers²   Reading comprehension   191    (³)   12.38   11.06   2.20     .31

¹ In class 42-G. Tested at Psychological Research Unit No. 1, January 1942.
² In classes 42-6, 42-7, and 42-8. Tested at Psychological Research Unit No. 1, January 1942.
³ Not reported.
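
The relation among the entries of table 5.1 can be illustrated with a short computation. The following is a minimal sketch only, assuming that the validity coefficients are biserial correlations against the graduation-elimination criterion and that the proportion shown is the proportion graduating; the text states neither point, and the function name is illustrative. Applied to the navigator vocabulary entries (means 31.87 and 27.85, standard deviation 7.03, proportion .79), it gives a value close to the tabled .32.

```python
from statistics import NormalDist


def biserial_r(mean_grad, mean_elim, sd_total, p_grad):
    """Biserial correlation between a test score and a pass/fail criterion.

    Assumed formula (not given in the text):
        r_bis = (M_g - M_e) / SD_t * (p * q / y)
    where q = 1 - p and y is the ordinate of the unit normal curve at the
    point that cuts off a proportion p of the area.
    """
    q = 1.0 - p_grad
    unit_normal = NormalDist()
    y = unit_normal.pdf(unit_normal.inv_cdf(p_grad))
    return (mean_grad - mean_elim) / sd_total * (p_grad * q / y)


# Navigator vocabulary entries of table 5.1
r_vocab = biserial_r(31.87, 27.85, 7.03, 0.79)
print(round(r_vocab, 2))
```

The small difference from the tabled coefficient is within what rounding of the means and standard deviation would produce.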


Verbal-Ability  Requirements  of  Air  Crew 

Knowing the complex nature of the airplane and its operations, we may
reasonably suppose that greater-than-average intellect would be required
for success as an air-crew officer. All air-crew members become officers
on attaining their wings. Some attention, therefore, had to be paid to the
selection of potential commanders, men possessing superior leadership
ability and intellect. It was logical, therefore, to seek a known measure
of so-called general intelligence. This led at once to tests of verbal ability
and comprehension, for, as Bingham (1) points out:

Without recourse to language, the processes of comparison, abstraction,
generalization, and mental organization would be limited indeed. With the aid
of verbal symbols, we can more easily wrestle with problems, manipulate
meanings, and test possible solutions of our difficulties mentally before we
act. Little wonder, then, that a good test of vocabulary is of use as an
indirect measure of a person’s verbal or conceptual intelligence, and for two
reasons: First, the richer his store of words and meanings, the better his
equipment for solving some of his problems promptly and correctly, that is, for
showing intelligence; second, the more intelligent he has been since infancy,
the greater the likelihood that he has gained command of a wide variety of
correct word meanings. Intelligence is far from being identical with the power
to read understandingly, to speak aptly, or to write coherently and concisely.
But the reciprocal relations between mastery of the mother tongue and ability
to think intelligently should be obvious.

Need for language proficiency in ground school. — Although a poor
showing in ground-school courses might not be sufficient basis in itself
for elimination from primary pilot training, nevertheless, failure to grasp
the theoretical concepts of flight would surely limit an individual’s
understanding of the function of an airplane and would possibly affect his
performance in the air deleteriously. Bingham (1) points out a common
cause of failure in school subjects.

Mention has been made of yet another danger signal: Poor ability in English.
A lack of equipment in the verbal tools of thought, revealed by low scores in
tests of vocabulary and of English usage, may signify either insufficient
training in the clear and precise use of language, or a shortage of verbal
intelligence without which it is difficult to master college subjects. * * *
The candidate’s previous school achievements and his performance in scholastic
aptitude tests furnish evidence regarding his general mental ability.

The navigation course includes much theoretical technical material that
must be read and comprehended, in such subjects as calibration, radio
navigation, and celestial navigation. Intelligence is required in navigation
study to infer indirect meanings from stated facts. It is reasonable to
suppose that an adequate measure of this ability would be obtained from
a reading-comprehension test. The striking difference between pilot and
navigator, in particular, as shown in table 5.1, promised one basis for
discrimination of aptitudes and hence for classification.

Summary 

To summarize, tests of verbal ability were incorporated in the Aviation
Cadet Classification Program because of work done and results achieved
with the verbal sections of the AAF Qualifying Examination, because of
the hypothesis that individuals of high general and verbal intelligence are
required to master the complexities of airplane operation and training,
and because bombardier, navigator, and pilot differ in their requirements in
this respect.

VOCABULARY  TESTS 

Cooperative Vocabulary Test, Form R, CI604A, CI605A

The  Cooperative  Vocabulary  Test,  Form  R,  was  published  by  the 
Cooperative  Test  Service  in  1941.  It  was  included  in  the  first  classification 
battery  during  the  winter  and  early  spring  of  1942. 

Description. — The two code numbers, CI604A and CI605A, refer to
the two different scores that were obtained from this test. CI604A refers
to the level-of-comprehension score, whereas CI605A refers to the
speed-of-comprehension score. The items are of a difficulty level appropriate
for examinees with approximately 2 years of college training.

(1) Internal characteristics. — The test contains 210 items arranged in
blocks of 30. The 30 items with the highest internal consistency are
presented on the first page, the 30 with the next highest internal consistency
are presented on the second page, and so on, in what is technically known
as "cyclical construction." Items are not arranged according to difficulty.

(2)  Administration. — The  directions  instruct  the  examinee  to: 

* * * Answer all the items you can on each page before going on to the next.
Answer the items as they come: be careful not to skip pages. This is not a speed
test, and your score does not depend as much on how many items you try to answer
as it does on how many you get right on each page you attempt. On the other
hand, the accuracy of your score will be decreased if you spend too much time on
any one page. * * *

The  time  limit  suggested  by  the  publishers  for  many  purposes  is  30 
minutes. 

Following are two typical items. The examinee is asked to indicate
which of the five alternative words is closest in meaning to the key word.

Soothsayer:            Denote:

1  Speaker.            1  Regard.
2  Prophet.            2  Write.
3  Comforter.          3  Indicate.
4  Singer.             4  Refuse.
5  Peacemaker.         5  Declare.

(3) Scoring. — As mentioned previously, two scores are obtained for
each examinee. Both scores are obtained by application of the formula
R - W/4 and the use of a conversion table yielding scaled scores.
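
The scoring formula can be illustrated briefly. The sketch below assumes that R is the number of items answered correctly and W the number answered incorrectly, with omitted items counted in neither; the divisor 4 is one less than the five alternatives offered per item. The function name and the example counts are illustrative, and the conversion to scaled scores is not reproduced because the conversion table is not given here.

```python
def formula_score(rights, wrongs, n_choices=5):
    """Rights minus a fraction of wrongs: R - W/(k - 1) for k-choice items."""
    return rights - wrongs / (n_choices - 1)


# Hypothetical examinee: 120 right, 40 wrong, remaining items omitted
raw = formula_score(120, 40)
print(raw)
```

An examinee who marks every item at random on a five-choice test expects about one right for every four wrong, so the formula returns an expected score near zero for pure guessing.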

Statistical results. — (1) Distribution statistics. — The distribution of
level-of-comprehension scores in the test is indicated by a mean of 61.1
and a standard deviation of 9.8, based on a sample of 225 unclassified
aviation students. For the speed-of-comprehension scores, a mean of 101.5
and a standard deviation of 41.9 was found for a sample of 243 unclassified
aviation students. These two samples were tested on March 2 and
March 4, 1942, at Psychological Research Unit No. 1.

(2)  Reliability.— A  reliability  coefficient  of  0.83  was  obtained  by  the 
test-retest  method  for  a  sample  of  438  pilot  eliminees.  This  sample  was 
tested  in  May  and  June,  1942,  at  Psychological  Research  Unit  No.  3,  and 
was  retested  2  or  3  months  later. 

(3) Test validity. — Validation data are presented in tables 5.2 and 5.3.

Table 5.2. — Validity data for Cooperative Vocabulary Test, Form R, CI604A
(Level)


Croup 

Criterion 

D 

fl 

M. 

SD, 

PiloM  in  primary  training 

Crail’iation  elimination 

5*7 

0.60 

>6.14 

>6  27 

>1.74 

-0.06 

Bombardiers*  . 

Rcconl  circular  error 

238 

.... 

....  ! 

....  ! 

9.7 

•.02 

Rccon!  circular  error 

194 

64.5' 

64.5' 

7.9 

•-.12 

<  Graduation  elimination 

194 

.88 

9.2 

.00 

Crailuation  elimination 

228 

.79 

61.5 

64.2 

9.0 

.02 

Kiri|iier>'  . . 

(Graduation  elimination 

183 

.84 

65.5 

60.3 

9.0 

.32 

* In terms of scaled scores with a mean of 5.00 and a standard deviation of 2.00.
* Tested in the period April through August 1942 at Psychological Research Unit No. 1.
* Product-moment correlations.
* Record circular error is an unreliable criterion. Various estimates of its
reliability vary between 0.00 and 0.40.
* Tested in the period February through April 1942 at Psychological Research
Unit No. [number illegible].
* In classes 42-10 to 42-17, Southeast Training Center. Tested in the period
February through April 1942 at Psychological Research Unit No. 1.
* Reclassified pilots. For testing dates and classes see footnote 7.

Table 5.3. — Validity data for Cooperative Vocabulary Test, Form R, CI605A
(Speed)

Group                         Criterion                N_t   p_g    M_g     M_e     SD_t    r_bis

Pilots in primary training¹   Graduation-elimination   5?7   0.60   93.45   97.50   26.85   -.09
Navigators                    Graduation-elimination   104    .35   109.1   102.1   31.2     .11
Bombardiers                   Record circular error    237   ....   ....    ....    35.5     .01²

¹ Tested April through August 14, 1942 at Psychological Research Unit No. 3.
Includes new aviation cadets and eliminated pilots.
² Pearson product-moment correlation.

Evaluation. — From tables 5.2 and 5.3 it appears that vocabulary tests
have slightly negative validity for pilots (for both level and speed scores).
Bombardier validity is approximately zero, though this conclusion must be
taken with reservation, due to the unreliability of the criterion. The
vocabulary test offered promise only as a navigator selection instrument, and
even navigator validities were rather low and variable in these small
samples. It may be concluded that within the range of ability of aviation
students, verbal intelligence is not a factor for success in training except
to a small extent for navigation. As an incidental comment, it may be said
that the intercorrelation of speed and level scores is quite high
(r = 0.84).

Vocabulary Test (AAF), CI604B

This test replaced Cooperative Vocabulary Test, Form R, CI604A,
CI605A, in the classification battery during the late spring and summer
of 1942. The Vocabulary Test (AAF) was a reprint of a commercial
test prepared by the Cooperative Test Service.

Description. — The Vocabulary Test, AAF, contains words of appropriate
difficulty for men with approximately 2 years of college training.
Thus, the difficulty is comparable to the preceding Cooperative Vocabulary
Test, Form R.

(1)  Internal  characteristics. — The  test  contains  150  items,  constructed 
in  blocks  of  30.  The  technique  of  cyclical  construction  is  also  employed 
here.  The  first  30  items  are  those  with  the  highest  internal  consistency, 
the  next  30  items  are  those  with  the  next  highest  internal  consistency,  and 
so  on.  The  words  are  not  arranged  in  order  of  difficulty. 

(2) Administration. — The time limit is 15 minutes. Approximately 3
minutes are required for the simple directions which specify that, "* * *
this is not a speed test. Your score does not depend so much on how many
items you try to answer as it does on how many you get right on each page
you attempt.” Answers are marked directly on a standard five-place IBM
answer sheet.

(3) Scoring. — In spite of the admonitions in the directions to the
contrary, the Vocabulary Test, AAF, is scored on a speed basis, in that only
the speed-of-comprehension score is obtained. The scoring formula is
R - W/4.

Statistical results. — (1) Distribution statistics. — For a sample of 1,000
unclassified aviation students (tested at Psychological Research Unit No.
3 in October and November 1943), the mean score was 48.1 and the
standard deviation 19.9.

(2) Internal consistency. — The internal consistency of 81 of the items
is indicated by a mean phi of 0.33, with a range from -0.12 to +0.88,
and a standard deviation of 0.20, based on the highest and lowest 27% of
200 unclassified aviation students.
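
The item statistic described above can be sketched as follows. This is an illustration only: it assumes that phi is computed from the fourfold table formed by group membership (highest or lowest 27% on total score) and item outcome (right or wrong). The item counts used are hypothetical, and the function name is not taken from the source.

```python
import math


def phi_upper_lower(upper_correct, lower_correct, group_size):
    """Phi coefficient for one item from equal-sized upper and lower groups.

    Fourfold table: rows are the groups, columns are right/wrong on the item.
    """
    a = upper_correct                # upper group, item right
    b = group_size - upper_correct   # upper group, item wrong
    c = lower_correct                # lower group, item right
    d = group_size - lower_correct   # lower group, item wrong
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        return 0.0  # item passed or failed by everyone: no discrimination
    return (a * d - b * c) / denom


# 27% of 200 examinees gives 54 per group; 45 and 27 are hypothetical counts
phi = phi_upper_lower(45, 27, 54)
print(round(phi, 2))
```

An item passed more often by the lower group than by the upper group yields a negative phi, which is how values such as -0.12 can arise.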

(3) Difficulty. — The difficulty level of items is indicated by a mean
proportion of correct responses equal to 0.67, corrected for chance
success. The proportions ranged from 0.00 to 1.00 with a standard
deviation of 0.27.
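
The correction for chance success is not spelled out in the text. A common form, assumed here, subtracts from the number passing an item one-fourth of the number failing it (for five-choice items) before dividing by the number of examinees. The counts below are hypothetical and the function name is illustrative.

```python
def corrected_difficulty(n_right, n_wrong, n_examinees, n_choices=5):
    """Proportion passing an item after removing expected lucky guesses."""
    return (n_right - n_wrong / (n_choices - 1)) / n_examinees


# Hypothetical item: 147 of 200 examinees right, 53 wrong, none omitting
p_corrected = corrected_difficulty(147, 53, 200)
print(round(p_corrected, 2))

# An item answered at the chance rate (one in five right) corrects to zero
p_chance = corrected_difficulty(40, 160, 200)
print(p_chance)
```

Under this assumption an uncorrected proportion of about 0.74 corresponds to a corrected proportion of about 0.67.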

(4) Factorial composition. — In one analysis, the only significant
loading appeared in the verbal factor, which this test helped to define. The
loading was 0.71. It is practically a pure test for the verbal factor, with
about 50% of its variance so allocated (the square of the loading, 0.71,
is approximately 0.50).

(5)  Test  validity. — Validity  data  are  given  in  table  5.4. 


Table 5.4. — Validity data for Vocabulary Test, AAF, CI604B


Group 

Criterion 

D 

M, 

M. 

SI), 

•’at. 

crH« 

IMota  in  primary 
training1  . 

Gradual  ion-elimination 

528 

0.87 

39.98 

46.34 

20.04 

-0.17 

-0.14 

Pilots  in  primary 

training  . 

Pilot*  through  bade 
training  . 

Graduation-elimination 

2.658 

.73 

Graduation  elimination 
Record  circular  error 
Graduation  elimination 

1.942 

320 

.89 

22  9 

OtiTl 

Navigators' . . 

171 

.94 

69.7 

62.3 

23.9 

.15 

* Assuming an unrestricted stanine standard deviation of 2.00.

1 In classes 44-H, 44-I, and 44-J. Tested at Psychological Research Unit No. 3.

* Product-moment r.

4 Classified in the period Apr. 1 to Aug. 14, 1942 at Psychological Research Unit No. 3.
Includes new aviation cadets and eliminated pilots.

Evaluation. — The Vocabulary Test, AAF (CI604B), is adequate, in
terms of level of difficulty and reliability, for use in classification of air
crew. It has a definite contribution to make to the selection of navigators,
none for bombardiers, and for pilots it might well carry a small negative
weight. One might question the wisdom of adverse selection of pilots on
any trait as important as verbal intelligence, however. Even if such
selection did improve graduation rate in training, it might work against
selection of potentially superior plane commanders.

In  a  battery  where  measurement  of  the  verbal  factor  is  required,  a 
vocabulary  test  is  strongly  to  be  recommended.  Taken  by  itself,  it  is  not 
as  valid  for  selecting  navigators  as  other,  impure,  verbal  tests  such  as 
reading  comprehension.  Where  uniqueness  is  a  requirement,  however,  the 
vocabulary  test  has  no  rivals  for  the  purpose  of  assessing  verbal-compre¬ 
hension  ability. 

READING  COMPREHENSION  TESTS 

Reading Comprehension (Training and Duties), CI606A *

This is the first form of the Reading Comprehension Test in the
classification battery. The paragraphs and questions concern the training and
duties of the navigator, pilot, and bombardier for a special reason. In the
early months of 1942, the roles of the navigator and bombardier in the
air crew had not been extensively publicized. Consequently, most of the
examinees were familiar only with the pilot’s job. Few examinees were
indicating first preference for navigator or bombardier training. It was
felt that if information about all the air-crew positions were presented
through the medium of this test, the number of stated preferences for
navigator and bombardier training would increase. Therefore, the
training and duties test was administered before the preference blank (see
chap. 26), on which examinees declared their first, second, and third
choices for types of air-crew training. It was believed that if examinees
were tested on the paragraphs, they would become more highly aware of
their content. It was also believed that a verbal-comprehension score
would be useful in selecting navigation students.

* Developed at Office of the Air Surgeon, Headquarters, AAF. Chief contributor: Lt. Col.
Laurance F. Shaffer.

Description. — The  training-and-duties  paragraphs  are  simple  descrip¬ 
tions  of  jobs,  the  training  involved,  and  the  individual  characteristics  re¬ 
quired  for  success.  The  attempt  to  glamorize  the  roles  of  navigator  and 
bombardier,  while  playing  down  the  pilot,  is  obvious  in  the  construction 
of  the  paragraphs.  Following  are  some  of  the  things  said  about  the 
navigator  and  his  job: 

The aviation cadet who is to become a navigator embarks upon a career
involving the most modern application of one of the oldest of all the sciences
* * * The extensive bombing experience of the present war has made clear the
extremely difficult and important task of the navigator. It is his job, by day
or by night, to chart the course that the bomber must fly from its base to the
objective to be bombed and back to the home base. The navigator must be a
person of superior intelligence, with a passion for accuracy and the power of
logical reasoning under conditions requiring speed, coolness, and precision.

The bombardier, likewise, is played up as an extremely important
member of the crew.

The military value of a bombing plane is no greater than the ability of its
bombardier to place his bombs on a military target. The bombardier should have
as much intelligence as any member of the crew, and should possess unusual
maturity of judgment and ability to accept responsibility; in addition, he must
have the best of vision to pick out his target from great height; he must have
superior muscle coordination to make delicate adjustments on the bomb sight,
and he must remain calm and steady under combat conditions.

(1) Internal characteristics. — The test contains 30 scored items, 10
devoted to each job description. Most items are extremely easy and seem
to serve the purpose of emphasizing the ideas presented in the paragraphs
as well as that of testing.

(2) Administration. — The instructions for this test are simple and are
all contained in the test booklet. Examinees are told to “Base your
answers on the reading material or on inferences which can be drawn
from it.”

Thirty minutes are allowed for completion of the test.

Following are two typical items:

The member of the air crew who must be most apt in mathematics is:

A. The bombardier.
B. The navigator.
C. The pilot.
D. The radio operator.
E. The gunner.

The major offensive force of the air arm is:

A. The bomber plane.
B. The fighter plane.
C. The interceptor plane.
D. The observation plane.
E. The pursuit plane.

(3) Scoring. — The scoring formula is R - W/4.

Statistical results. — Fundamental data are quite complete on this test
but with relatively small samples. All the data given below are for
examinees tested in March 1942 at Psychological Research Unit No. 1.

(1)  Distribution  statistics. — The  distribution  of  scores  in  this  test  is 
indicated  by  a  mean  score  of  19.8  and  a  standard  deviation  of  5.1,  based 
on  a  sample  of  135  unclassified  aviation  students.  The  distribution  was 
markedly  negatively  skewed. 

(2) Internal consistency. — The internal consistency of items is
indicated by a mean phi of 0.28 with a range from 0.00 to 0.60 and a standard
deviation of 0.15, based on the highest 27% and the lowest 27% of 200
unclassified aviation students.

(3)  Reliability  coefficient. — A  reliability  coefficient  of  0.86,  corrected, 
was  obtained  by  the  odd-even  method  on  a  sample  of  135  unclassified 
aviation  students. 
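
The text does not say how the odd-even coefficient was corrected. The usual step, assumed in the sketch below, is the Spearman-Brown formula, which estimates the reliability of the full-length test from the correlation between its two halves. The function names are illustrative.

```python
def spearman_brown(r_half):
    """Estimated full-test reliability from the correlation between halves."""
    return 2 * r_half / (1 + r_half)


def implied_half_correlation(r_full):
    """Inverse of the step-up: half-test correlation implied by r_full."""
    return r_full / (2 - r_full)


# Half-test correlation that would step up to the reported 0.86
r_half = implied_half_correlation(0.86)
print(round(r_half, 3), round(spearman_brown(r_half), 2))
```

Under this assumption the uncorrected correlation between the odd and even halves would have been roughly 0.75.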

(4) Difficulty. — The difficulty level of items in the test is indicated by
the mean proportion of correct responses equal to 0.80, corrected for
chance success. The proportions ranged from 0.47 to 1.00 with a
standard deviation of 0.18. These statistics are based on the data for 200
unclassified students.

(5) Test validity. — Validity data are shown in table 5.5.


Table 5.5. — Validity data for Reading Comprehension, CI606A


Group 

^  Criterion 

N. 

M. 

M, 

sn, 

Pilots  in  primary  tra»nii.< 

Graduation  elimination 

a4> 

o.r.o 

>6.00 

>5.o0 

1.92 

0.06 

2 18 

3.8 

*.00 

Navigators*  .  ... 

1 ir.ulvtition 

19  i 

.  ’S 

Yw‘ 

23.3 

4.0 

.31 

* In terms of scaled scores, with a mean of 5.00 and a standard deviation of 2.00.

* New aviation cadets and reclassified pilots, classified at Psychological Research Unit No.
3 from Apr. 1 to Aug. 14, 1942.

1 Product-moment correlation.

Evaluation. — In view of the fact that the first form of Reading
Comprehension, CI606A, was developed primarily to increase the number of
preferences for navigator and bombardier training and was clearly
deficient with regard to difficulty level, the statistical data are not very
revealing. The easiness of the items is indicated by the unusually low
difficulty level (0.80, corrected for chance). The test does not appear to
be valid for the pilot training criterion or the bombardier circular error
criterion. Its navigator validity on one small sample is fairly satisfactory.

Reading Comprehension Test, CI614G ³

This is the seventh revision of a Reading Comprehension Test based on
material in two test booklets prepared in the Office of the Air Surgeon in
June 1942. In December 1942, it was placed in the classification battery.

³ Developed at Psychological Research Unit No. 3. Chief contributors: Capt. Lloyd G.
Humphreys, Maj. Merrill F. Roff, and Lt. Mahlon B. Smith.

Description. — The paragraphs and the questions and answers used were
carefully selected through item analyses of a large amount of trial
material (the six previous forms). Two of the paragraphs were composed of
material that might logically appeal to pilots, two others to navigators, and
two others to bombardiers. The first paragraph deals with the subject of
the rotating gun turret. The second paragraph describes the thrust and
torque forces resulting from movement of the constant-speed propeller.
Paragraph three discusses the north celestial pole and its relationship to
various geographical positions on the earth’s surface. Paragraph four
describes the reasons for the drift of a projectile. The fifth paragraph
involves a description and evaluation of the Mercator projection. The sixth
and final paragraph tells of the formation and control of carburetor ice.

(1) Internal characteristics. — This test contains 30 scored items based
on 6 paragraphs. Four to six items pertain to each paragraph.

(2) Administration. — The directions specify that 30 minutes are
allowed for completion of the test. The administrator gives a time warning
at the end of 10 minutes and again at 20 minutes. Answers are marked
on the standard five-place IBM answer sheet. Following are two typical
items, each pertaining to a different paragraph:

The turret always moves:

A. 360 degrees.
B. At an increasing speed.
C. In a circular path.
D. When the hand crank is turned.
E. When the clutch lever is in the down position.

How does a Mercator projection, as compared to a globe, change the relative
sizes of Norway and Spain?

A. There is no change in size since the Mercator projection is conformal.
B. Only the relative length of Norway is increased.
C. Only the relative width of Spain is increased.
D. Only the relative width of Spain is decreased.
E. The relative area of Norway is increased.

(3) Scoring. — The formula used in scoring Reading Comprehension
CI614G is 2R - W/2, which is equivalent to R - W/4. Empirical studies
of the optimal weight for W when the weight for R is unity result in
the conclusion that the formula R - W/7 is best for pilot selection. In
samples of 1,096 and 1,226 pilots in primary training the empirical
weights are -0.144 and -0.151, respectively. The validities for pilots to
be expected from this formula yielded gains of only 0.001 in both
instances, however, so no change in the traditional formula is called for.
No corresponding data for navigators are available. Since the test is
primarily a navigator-selection instrument, any modification of scoring
formula should be made in accordance with similar studies for that
specialty.
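
The sense in which the two scoring formulas are equivalent can be shown directly: 2R - W/2 is exactly twice R - W/4, so the two rank examinees identically and correlate identically with any criterion. The sketch below uses hypothetical right and wrong counts, and its function name is illustrative. It also converts the two empirical weights into the divisor form used in the text.

```python
def weighted_score(rights, wrongs, right_weight, wrong_weight):
    """Linear score: right_weight * R + wrong_weight * W."""
    return right_weight * rights + wrong_weight * wrongs


# Hypothetical (R, W) pairs for four examinees on a 30-item test
examinees = [(25, 5), (20, 4), (18, 12), (10, 20)]
doubled = [weighted_score(r, w, 2.0, -0.5) for r, w in examinees]
standard = [weighted_score(r, w, 1.0, -0.25) for r, w in examinees]
print(doubled, standard)

# Empirical weights for W expressed as divisors, as in R - W/7
divisors = [round(-1 / weight, 1) for weight in (-0.144, -0.151)]
print(divisors)
```

Both empirical weights correspond to divisors close to 7, which is the basis for the formula R - W/7 mentioned above.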

Statistical  results. — (1)  Distribution  statistics. — The  distribution  of 
scores  in  this  test  is  indicated  by  a  mean  score  of  19.3  and  a  standard 
deviation  of  11.9,  based  on  a  sample  of  1,095  unclassified  aviation  stu¬ 
dents,  tested  with  the  December  1942  Classification  Battery,  at  Psycho¬ 
logical  Research  Unit  No.  1. 

(2)  Internal  consistency. — The  internal  consistency  of  items  is  indi¬ 
cated  by  a  mean  phi  of  0.43,  with  a  range  from  0.24  to  0.73,  and  a  stand- 


Table 5.6. — Validity data for Reading Comprehension Test, CI614G


Group 

1  Criterion 

1 

\  V 

!  *  I 

p# 

M, 

|  M,  j  SD, 

1  r»ta  •r*u* 

1  J  1 

i 

1  i 

Pilots  in  primary  !  . 

training’  . Graduation-elimination 

Pilot*  through  basic  I  .... 

training1  . Graduation-elimination 

Pilots  in  advanced  I 

single  engine  I 

training4  . Average  daily  grades*  . 

Pilots  in  advanced  ] 

twin-engine  | 

training'  . \vcrage  daily  grades*  . 

Pilot*  in  B-17  | 

transitional  training*  .Graduation-elimination 
Pilots  in  B-24  I 

transitional  training*  Graduation-elimination 
Pilots  in  B-2S  I 

transitional  training*  Graduation-elimination 
Pilots  in  B-2t 

transitional  training*  Graduation-elimination 

Bombardiers’  . Graduation-elimination 

Navigators'*  . Graduation-elimination 

Navigators"  . tirades  in  dead 

.  reckoning  (ground 

!  school)  . ...» 

Navigators"  . 'Grades  in  celestial 

i  navigation  (ground 

I  school)  . 

Navigators"  . . Grades  in  dead 

|  reckoning  (flight)  . 

Navigators"  . ;  Grades  in  celestial 

|  navigation  (flight) 

Navigator*"  . ’Grades  in  meteorology 

Navigators"  . Military  grades  ..... 

Navigators"  . Final  Coiuiio-ite  Grade 

Radio-operator  Graduation-elimination 

mechanics  in  | 

training"  . Composite  grades  .... 

Radio-operator 


,779 
1.046  j 

277  ! 

360 


0.88 

.S7 


22.87  19.08 


19.91 


mechanic*  in 

training"  . Final  academic  grades 

Air  mechanics  in 


I. 


•unners  in  training'' 


percent 


16.79 


11.13 

11.12 


1,046  | 

.98  j 

21.20 

15.88  : 

11.87  j 

933 

.92 

20.80 

17.51 

11.62  j 

313 

.98 

21.39 

23.7! 

11.66  I 

380 

1.829 

732 

46J 

463 

463 

463 

463 

463 

463 

2JS 

.82 

.79 

.87 

22.76 

16.51 

24.10 

I8.S5 

14.79 

17.12 

11.55  ] 
10.26 
11.76 

•  •  • 

!!"!  i 

’.65 

15.75 

17.09 

10.70 

153 

232 

0  o  o 

ooo 

»  •  •  o  • 

11.16 

i  194 

1 

•  O  0 

11.6 

0.18 

.18 

*.0J 

•00 

.18 

.14 

-.08 

.18 

.10 

.32 

“28 

19 

“.09 

“18 
”30  i 

”.io  : 
“.26  ! 
-.08 

”.39 

”41 

“.00 


0.20 

.23 

.07 

.02 


.IJ 

.47 


.42 

.31 

.18 

.29 

.41 

.16 

.41 


*  Assuming  an  unrestricted  Matiine  standard  deviation  of  2.00, 

'  In  cta-s  44-F.  Te-led  al  Psychological  Research  Unit  Nos.  1,  2,  and  3. 

*  In  cta'x  4J  J.  Texted  at  Psychological  Kexearch  Unit  No*.  I,  2,  and  J. 

*  At  Kuxter  and  Moure  Field*.  Te.-ted  between  July  22,  I94J  and  Oct.  JO.  I94S  at  P*yckd- 
lofftcal  Rex-arch  Unit  No,  2. 

•Converted  into  normaliied  Manine  *core*. 

*  /  avrraerd  (averagr  of  roeffuirtif*  for  2  school*)  product  moment  correlation!. 

*  Al  Fllmcton  an*!  FfrMrritit  Field*.  Toted  between  July  22,  194 J  and  Oct.  J0t  194 J  at 
r KoIo^k aJ  Krxjrch  Unit  No.  2. 

•In  cIjxxb  4 J  J  and  4i  K.  Tested  at  Psychological  Research  I  nil*  No*.  1,  2.  and  J. 

•In  cUura  41-i,  4J-9,  4J-I0,  and  4J-II.  Toted  ai  Psychological  Research  Unit*  No*.  1,  2, 

and  J 

"In  c!at*e*  41-10  and  43-11.  Te.led  si  Psychological  Research  Units  Nos.  1,  *"<12* 

"In  HnsvU  cl j*.cs  4.1  10  through  41  IS.  71  cases  tested  al  Psychological  Research  Unit  No. 
t.  US  at  Psychological  Research  Unit  No.  2  «nd  37  at  Psychological  Research  Unit  No.  J. 

"  Product -moment  correlations. 

"  Tested  at  Psychological  Research  Unita  Nos.  I.  2,  and  3.  Composite  grades  available  only 
for  graduates.  ......  .  .  ,  . 

"In  class  41-45.  Tested  at  Psychological  Research  Unit  Nos.  1,  2,  and  J. 

"  This  criterion  was  extremely  unreliable. 
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ard  deviation  of  0.12,  based  on  the  highest  27%  and  the  lowest  27%  of 
117  unclassified  aviation  students,  tested  in  August  1942  at  Psychological 
Research  Unit  No.  3. 
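
The internal-consistency index in paragraph (2) can be illustrated with a short Python sketch. It assumes the usual upper-lower procedure: examinees are ranked on total score, the highest and lowest 27% are retained, and phi is computed from the 2 x 2 table of group membership against success on the item. The function names and the toy data are hypothetical, and the report does not describe the computing routine actually used.

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi for a 2 x 2 table laid out as
               pass  fail
        upper    a     b
        lower    c     d
    """
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else 0.0

def upper_lower_phi(total_scores, item_correct, fraction=0.27):
    """Phi between one item and membership in the upper vs. lower tail."""
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    k = max(1, round(fraction * len(order)))   # size of each tail group
    lower, upper = order[:k], order[-k:]
    a = sum(item_correct[i] for i in upper)    # upper group, item right
    c = sum(item_correct[i] for i in lower)    # lower group, item right
    return phi_coefficient(a, k - a, c, k - c)

# Hypothetical data: 10 examinees, one item scored 1 (right) or 0 (wrong).
totals = [5, 9, 12, 15, 18, 21, 24, 27, 30, 33]
item = [0, 0, 1, 0, 1, 0, 1, 1, 1, 1]
print(round(upper_lower_phi(totals, item), 2))
```

With 117 examinees, as in the sample described, each tail group would hold about 32 cases.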

(3) Reliability coefficient.—An odd-even reliability coefficient of 0.76,
corrected, was obtained from a sample of 480 unclassified aviation
students, tested at Psychological Research Unit No. 3.
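
The odd-even procedure behind this coefficient can be sketched in Python. The sketch assumes that "corrected" refers to the Spearman-Brown step-up from half-test to full-test length, which the report does not state outright. The response matrix is invented for illustration.

```python
import math

def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_brown(r_half):
    """Step a half-test correlation up to full-test length."""
    return 2 * r_half / (1 + r_half)

def odd_even_reliability(item_matrix):
    """item_matrix: one row per examinee, one 0/1 entry per item."""
    odd = [sum(row[0::2]) for row in item_matrix]    # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_matrix]   # items 2, 4, 6, ...
    r_half = pearson_r(odd, even)
    return r_half, spearman_brown(r_half)

# Hypothetical responses: 6 examinees by 6 items.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
r_half, r_full = odd_even_reliability(responses)
print(round(r_half, 2), round(r_full, 2))
```

Under that assumption, a corrected coefficient of 0.76 would correspond to a correlation of about 0.61 between the odd and even halves.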

(4) Difficulty.—The difficulty level of items in the test is indicated by
the mean proportion of correct responses equal to 0.40, corrected for
chance success. The proportions ranged from 0.00 to 0.84 with a standard
deviation of 0.21. These data were based upon results from 117
unclassified aviation students, tested in August 1942 at Psychological
Research Unit No. 3.
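
A proportion correct "corrected for chance success" can be sketched as follows. The sketch assumes the conventional correction, in which the number of wrong answers divided by the number of wrong alternatives is subtracted from the number of right answers before dividing by the number of examinees. The report does not print the formula, and the counts below are invented.

```python
def corrected_difficulty(rights, wrongs, n_examinees, n_choices=5):
    """Proportion passing an item after removing expected lucky guesses.

    Omitted responses count neither as right nor as wrong.
    """
    return (rights - wrongs / (n_choices - 1)) / n_examinees

# Hypothetical item: 117 examinees, 60 right, 40 wrong, 17 omitted.
print(round(corrected_difficulty(60, 40, 117), 2))

# An item answered at the pure-guessing level (1 in 5 right) corrects to zero.
print(corrected_difficulty(20, 80, 100))
```

On this reading, a corrected proportion of 0.00 marks an item on which the group did no better than chance.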

(5) Factorial composition.—Reading Comprehension, CI614G, was
factor analyzed with four different batteries. It helped to define the verbal
factor in each battery, with loadings of 0.54, 0.69, 0.65, and 0.65, and
a weighted average of 0.60. There were significant loadings in the
mechanical experience factor for the three analyses in which this factor
appeared, the weighted-average loading being 0.37. The third highest
loading for this test appeared in the general reasoning factor in all four
analyses, the weighted average being 0.27. The loadings for the numerical
factor in CI614G, as found in the four analyses, were 0.09, -0.02, 0.11,
and 0.15 with a weighted average of 0.12, which contributed slightly to its
validity for navigator selection, as did its reasoning variance. It will be
seen in the discussion of the next form of this test (CI614H) how the
numerical loading increased after an attempt was made to increase the
navigator validity of the test.

(6) Test validity.—For validity data for various types of training, see
table 5.6. For air-crew selection, this test has most validity for the
navigator, next for the pilot at all levels of training, and lowest for
bombardier. It has substantial promise for selection of radio operators
and air mechanics.
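
For graduation-elimination criteria, a validity coefficient of this kind can be reproduced from the summary columns of the table. The sketch below assumes the standard biserial formula and assumes that the standard deviation column refers to the total group. The inputs 0.87, 24.10, 17.12, and 11.76 appear among the table 5.6 entries and are taken here, as an assumption, to belong to the navigator graduation-elimination row. The function name is hypothetical.

```python
from statistics import NormalDist

def biserial_r(p_grad, mean_grad, mean_elim, sd_total):
    """Biserial correlation between test score and a pass/fail criterion."""
    q_elim = 1.0 - p_grad
    z = NormalDist().inv_cdf(p_grad)   # cut point separating the two groups
    y = NormalDist().pdf(z)            # normal ordinate at that cut point
    return (mean_grad - mean_elim) / sd_total * (p_grad * q_elim) / y

# Navigators, graduation-elimination (values as read from table 5.6).
print(round(biserial_r(0.87, 24.10, 17.12, 11.76), 2))  # 0.32
```

The result agrees with the coefficient of .32 listed for navigators, which suggests that the reading of the row is correct, although the report itself does not show the computation.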

(7) Item validity.—Items in this test were correlated with navigator
training criteria (both preflight and advanced) and a bombardier-training
criterion (preflight grades). The results are shown in table 5.7.


Table 5.7.—Item Validity Data for Reading Comprehension, CI614G

Group                                Criterion                   N     p   M phi  SD phi  Range of phi
Navigators¹                          Graduation-elimination    8?0   .12    0.04    0.06  -0.07 to 0.14
Bombardiers in preflight training¹   Weighted average grade    190   ...     .09     .10  -.1? to .24
Navigators in preflight training¹    Weighted average grade    3??   ...     .1?     .11   .01 to .47

¹ Tested in June 1941 at Psychological Research Unit No. 3.


Evaluation.—In CI614G, the seventh form of this test, a highly refined
test of reading comprehension had been developed which could be
considered an adequate measure of the ability to comprehend technical
material. The hypothesis for the development of tests of verbal ability
had held that such tests would be valid for all air-crew-officer positions.
The validity statistics cited above, however, indicate wide difference in the
amount of validity for pilot, bombardier, and navigator training success.
It was evident that the greatest contribution that probably could be made
by a reading-comprehension test was in the field of navigator selection.
Further research was pointed towards an attempt to increase the navigator
validity of the test.

Factorial results show that this is not by any means a pure verbal test.
While 42% of its entire variance is verbal, about 11% must be allotted to
mechanical experience, and about 6% to general reasoning. Its validity
for the navigator may be due almost as much to its numerical component
as to its verbal. The same may be said for its small bombardier validity.
The reasoning component would contribute only to navigator validity.
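
The step from a factor loading to a share of variance can be made explicit. Assuming uncorrelated factors, the proportion of a test's variance attributable to a factor is the square of its loading on that factor. The sketch below applies this to the verbal loadings quoted earlier. Which analysis supplied the percentages in the text is not stated, so the pairing of 0.65 with the 42% figure is an inference.

```python
def variance_share(loading):
    """Proportion of test variance due to one factor (orthogonal factors assumed)."""
    return loading ** 2

# Verbal loadings reported for form G: 0.65 in two analyses, 0.60 as weighted average.
print(f"{variance_share(0.65):.0%}")  # 42%
print(f"{variance_share(0.60):.0%}")  # 36%
```

The 42% in the text therefore matches a verbal loading of 0.65 rather than the weighted average of 0.60.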

Variations of the test.—Reading Comprehension, CI614G, was preceded
by six preliminary forms. Since a test of this type is most satisfactory
when there are several good questions for each paragraph of reading
material, more time than usual was spent on trial runs and item analyses
in order to maximize the number of differentiating items for the
paragraphs selected. Difficulty arose not in finding good items, but in
finding good complete sets of four to six items. The questions and
alternate choices finally included were carefully selected on the basis of
their effectiveness for the aviation student.

Editorial problems were ever present in constructing these preliminary
forms. Paragraphs had to attain a certain difficulty, yet they had to
possess a certain degree of clarity. Furthermore, it was desirable that they
contain as much information as necessary for drawing direct or indirect
inferences in answering items. In this respect, the revision was not quite
successful, as the substantial mechanical variance shows. The variance
indicates that even though it was believed that all necessary information
had been provided, examinees still profit by previously gained mechanical
experience in responding to items. The restrictions listed in this
paragraph posed literary and semantic difficulties that were not easy to
overcome.

Reading Comprehension, CI614H *

Form H of Reading Comprehension was specifically designed to
discriminate between successful and unsuccessful navigators and is
therefore more difficult than the previous forms designed to rank the
entire aviation-student population.

* Developed at Psychological Research Unit No. 3. Chief contributors:
Lt. Lewis C. Carpenter, Jr., Capt. Frederick B. Davis, T/Sgt. Paul C. Davis,
Lt. William M. Wheeler, and Cota d Wright.

Description.—(1) Internal characteristics.—This test is composed of
8 paragraphs concerning which 36 questions are asked. The items were
designed to test ability to make valid inferences from the reading material
as well as ability to answer more direct questions about the content. Short,
succinct paragraphs were chosen, both because they lend themselves to
inferential items better than longer, more explicit passages, and because
the number of items answered in the same length of time could be
increased, thus increasing reliability. In order to prevent an increase in the
correlations of the new form with other tests having heavy weights for
pilot or navigator, items were selected on a basis of their lack of
mechanical and numerical content as well as for their consistency with the
total test. Thus, an item that had a high internal-consistency phi
coefficient, and that was also correlated with the total score on one of the
mechanical or numerical tests, was not so acceptable as an item with a
similar phi that was not correlated with scores on the other tests.
Paragraphs were taken from a wide range of technical material, including
texts on navigation, physics, map reading, astronomy, and airplane
instruments. This selection was made on the basis not only of greater
pertinence to the type of reading the cadets would encounter in
ground-school courses, but also of greater face validity. Material
pertaining to all types of air-crew operation was included to make it
appear that the test was one that pilots as well as others should take
seriously.
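
The item-selection rule described in this paragraph can be sketched as a simple ranking. The report states only a preference: of two items with similar internal-consistency phi, the one less correlated with mechanical or numerical tests is more acceptable. The scoring function, the penalty weight, and the item statistics below are hypothetical illustrations, not the procedure actually used.

```python
from dataclasses import dataclass

@dataclass
class ItemStats:
    item_id: str
    internal_phi: float   # phi with the reading test's own total score
    external_r: float     # correlation with a mechanical or numerical test total

def preference(item, penalty=0.5):
    """Higher is better: reward internal consistency, penalize outside overlap.

    The penalty weight is an illustrative assumption, not a value from the report.
    """
    return item.internal_phi - penalty * abs(item.external_r)

def rank_items(items, penalty=0.5):
    """Order candidate items from most to least acceptable."""
    return sorted(items, key=lambda it: preference(it, penalty), reverse=True)

candidates = [
    ItemStats("item_07", internal_phi=0.45, external_r=0.40),
    ItemStats("item_12", internal_phi=0.44, external_r=0.05),
    ItemStats("item_19", internal_phi=0.20, external_r=0.02),
]
for it in rank_items(candidates):
    print(it.item_id, round(preference(it), 3))
```

Here item_12 outranks item_07 although their phi values are nearly equal, because item_07 overlaps more with the outside test.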

A typical paragraph and the three questions asked about it are here
reproduced to illustrate the inference-drawing technique.

Force and counterforce are equal and opposite. A force is always accompanied by
a counterforce. The force on any one body is always exerted by some other body;
this other body itself experiences an equal and opposite force. The two are parts, or
different aspects, of one inseparable whole.

The general principle most justifiably derived from this paragraph is that:

A. The resultant of two forces cannot equal one opposing force.

B. All forces in the universe maintain equilibrium.

C. Force and counterforce differ in magnitude rather than direction and can
   thus be considered as parts of one inseparable whole.

D. The effect of a number of forces and counterforces acting on an object
   is always movement.

E. The work done by a body is not a function of the strength of the forces
   and counterforces.

In the case of pressing one's hand against a wall, the statements in the paragraph:

A. Would be true, depending upon the amount of force with which one pushed.

B. Would be true only if the wall were rigid.

C. Would be true only if the wall moved in the direction of the force.

D. Would be true only if something were pushing against the wall from the
   other side.

E. Apply without qualification.

The statements in the paragraph imply that a strong man striking a much weaker
one would encounter a force:

A. Equal to his own.

B. Less than his own.

C. Greater than his own.

D. That would vary with the difference in strength between the strong and
   the weak man.

E. That would cause equal movement in the bodies of the weaker and the
   stronger man.

(2) Administration.—The directions for this sort of test were
extremely simple, since the task was obvious. It was necessary only to
caution the students to answer the questions on the basis of information
contained in the paragraph. They were permitted to reread part or all of
the paragraph while answering the questions. The time allowed to
complete the test was limited to 30 minutes so that about one-third of the
group would be able to finish.

(3) Scoring.—The standard scoring formula for five-choice items,
multiplied by a factor of 2, was used; i.e., 2R - W/2.
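
The scoring rule can be written out as a short Python sketch. It takes the standard five-choice formula to be rights minus one-fourth of wrongs, which doubles to the 2R - W/2 given above. The function name and the example counts are hypothetical, and omitted items are assumed to contribute nothing.

```python
def score_reading_comprehension(rights, wrongs):
    """Form H score: twice the five-choice formula score, i.e. 2R - W/2."""
    return 2 * rights - wrongs / 2

# Hypothetical examinee on the 36-item test: 20 right, 8 wrong, 8 not reached.
print(score_reading_comprehension(20, 8))   # 36.0

# A perfect paper gives the maximum possible score.
print(score_reading_comprehension(36, 0))   # 72.0
```

Doubling leaves the rank order of examinees unchanged and only stretches the scale, so that the maximum score on 36 items is 72.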

Statistical results.—(1) Distribution statistics.—The data for several
samples are given in table 5.8.


Table 5.8.—Distribution Data for Reading Comprehension, CI614H

Group                                                  N       M      SD
Unclassified aviation students (post-college)¹     1,500    20.8    12.4
Classified pilots²                                 1,676    19.3    11.3
West Point class 1946, classified pilots             888    33.3    14.5

¹ Tested November 1941 at Psychological Research Units Nos. 1, 2, and 3.
² In class 44-I. Tested at Psychological Research Units Nos. 1, 2, and 3.


Table 5.9.—Reliability coefficients for Reading Comprehension, CI614H, based upon
groups of unclassified aviation students

N          Type      r      r (corrected)
1,000¹     ?        0.74    0.85
500²       ?         .52     .68

¹ Tested in April 1944 at Medical and Psychological Examining Unit No. 7.
² Tested at Medical and Psychological Examining Unit No. 10.
³ Items assigned into two groups judged to be more or less equivalent.


(2) Internal consistency.—The internal consistency of items is indicated
by a mean phi of 0.41, with a range from 0.13 to 0.63 and a standard
deviation of 0.10, based on the highest 27% and the lowest 27% of
400 unclassified aviation students, tested in October 1943 at Psychological
Research Unit No. 3.

(3) Reliability coefficient.—Reliability estimates are shown in table 5.9.

(4) Difficulty.—The difficulty level of items is indicated by the mean
proportion of correct responses equal to 0.32, corrected for chance
success. The proportions ranged from 0.01 to 0.66 with a standard deviation
of 0.16. The sample consisted of the 400 unclassified aviation students
tested in October 1943 at Psychological Research Unit No. 3.

(5) Factorial composition.—Reading Comprehension, CI614H, was
factor analyzed with two different batteries. Contrary to expectations, the
loading in the verbal factor did not increase over that in the G form of
the test. The verbal loadings for the H form were 0.58 and 0.59 as
compared to a weighted average loading of 0.60 for the G form. The loading
in the mechanical experience factor did decrease as had been desired,
although the amount is uncertain. The loadings were 0.33 and 0.04 in two
analyses as compared to a weighted average of 0.37 for the three G-form
analyses. It had been desired that the numerical content of form H would
not be greater than that of form G, to prevent an increase in the
correlations of the new form with other tests heavily weighted for
navigator. An attempt had been made, as mentioned previously, to select
items partially on the basis of their lack of mechanical and numerical
content as well as their consistency with the total test. It was not possible,
however, in seeking complex reading material, to find paragraphs entirely
free of mechanical or numerical content. This fact was emphasized by the
characteristic loading of 0.14 in the numerical factor.

(6)  Test  validity. — There  are  validity  data  for  air  mechanics  as  well 
as  for  pilot  training,  but  unfortunately  none  for  the  navigator  (see 
table  5.10). 


Table 5.10.—Validity Data for Reading Comprehension, CI614H

Group and criterion                 N    pg     Mg     Me    SDt    rbis  rbis (corr.)
Pilots in primary training¹
  Graduation-elimination        1,676  0.89  19.46  18.24  11.29    0.06   ²0.14
Pilots in primary training³
  Graduation-elimination        1,145   .84  21.01  18.60  12.47     .11    ⁴.10
Air mechanics in training⁵
  Final average grade             254   ...    ...    ...    ...    ⁶.20     ...
Air mechanics in training⁵
  Final average grade             428   ...    ...    ...   8.22    ⁶.20     ...
Armament trainees⁷
  Average armament grades         269   ...    ...    ...  10.65    ⁶.22     ...

¹ In class 44-J, tested at Psychological Research Units Nos. 1, 2, and 3.
² Assuming an unrestricted augmented stanine standard deviation of 1.91.
³ In class 44-I, tested at Psychological Research Units Nos. 1, 2, and 3.
⁴ Assuming an unrestricted stanine standard deviation of 2.04.
⁵ Tested with the November 1941 Classification Battery at Medical and
Psychological Examining Unit No. 6.
⁶ Product-moment correlation.
⁷ In Lowry Field armament classes 14-44A and 35-44A.


Evaluation.—Examination of validity data obtained on Reading
Comprehension, CI614G, indicated that an attempt should be made in the next
form (CI614H) to increase the validity of the test for navigator success.
Form H, then, was designed specifically to separate good from poor and
mediocre navigator material without regard for discrimination at the
lower levels, but with regard for ranking the more apt candidates. The
test, therefore, is more difficult than the previous forms designed to rank
the entire aviation-student population. It was planned to revise Form G,
increasing the verbal factor content, decreasing the mechanical
experience content, and at least holding constant the correlation of reading
comprehension with numerical tests. As pointed out in the discussion of
factor content, the loading in the verbal factor did not increase in form
H. The loading in the mechanical experience factor did decrease, although
certainly not as much as had been hoped, and the usual loading appeared
in the numerical factor. No validity statistics have been computed for
navigators on Form H, but the factor results indicate that the goals
anticipated in the CI614H revision were not realized. A further revision
is therefore in order.

Variations of the test.—This form was preceded by five preliminary
forms, CI614HX1, HX2, HX3, HX4, and HX5. Form HX1 contained
38 items based on 4 paragraphs, on the subjects of radio beams, the
function of an air-speed meter, parallax, and the magnetic pole in the
Northern Hemisphere and its properties. Form HX2 contained 38 items based
on 6 paragraphs, all different from HX1. These paragraphs discussed (1)
Bridgman's work in modern physics, (2) interaction of waves and mass,
(3) force and counterforce, (4) navigational bearing, (5) radio bearing,
and (6) compensation of a compass. Except for the last, these paragraphs
were considerably shorter than those in the first preliminary form. The
next revision (HX3) incorporated three paragraphs from HX1 and one
from HX2, with 28 items. Form HX4 clarified and shortened the
paragraphs on Bridgman's contribution to modern physics, force and
counterforce, and compass compensation. To these were added paragraphs on
sound waves and spectrum colors. Twenty-eight items were based on
these five paragraphs.

The final experimental form, HX5, was lengthened to include 9
paragraphs and 50 items. The testing time was increased to 50 minutes, with
a provision for extra time, if necessary, to allow 75 to 80% of the students
to finish. The paragraphs as they appeared in this form were rewritten,
where this was deemed desirable, on the basis of internal-consistency item
analysis of the previous forms. In the classification battery form,
CI614H, the test was cut to 36 items and 8 paragraphs. Those included
were on (1) the magnetic North Pole, (2) force and counterforce, (3)
spectrum colors, (4) compass compensation, (5) Mercator projection,
(6) air-speed meter, (7) bearing, and (8) Bridgman's work in physics.

After Reading Comprehension, CI614H, was placed in the classification
battery, an alternate form, CI614JX1, was constructed. The type of
reading material included in the paragraphs is similar to Form H,
although the actual subject matter of all the paragraphs in JX1 is different.
JX1 contains 42 items based on 7 paragraphs. Forty minutes are allowed
for completion of the test. The paragraphs include discussions of (1)
radio television, (2) altitude tolerance, (3) supersonics, (4) hydraulic
systems, (5) atmospheric pressure, (6) refraction of light rays, and (7)
steam turbines.

Psychomotor Instruction Comprehension Test, CI626B *

This test was designed to measure the comprehension of instructions
given for psychomotor classification-battery tests.

Description.—The test is administered after students have completed
the six psychomotor tests. If the administration takes place immediately
following the last psychomotor test, memory factors are minimized. This
is desirable, since the test is designed primarily as a measure of
comprehension. Diagrams of each psychomotor test are presented, and the
examinee is asked questions about each task.

*  -loefi  •!  Jif!  I’.jiMmiol  Fvi'iti.ine  l ’nit  No  10  o  ft  -ontrit>ul»f :  C*|*- 

Jo>fpk  K_  Kin*.  An  e»rl«»»  ol  tk*  urn  l>|*  »»»  «tf >  ftopfJ  *1  Mill'  Nn.  *.  CVf f  cno- 
(ributnf:  S/S|t.  Aflknr  Z.  Ctrl. 
(1) Internal characteristics.—The test contains 83 items, 6 of which
are on the instructions given preliminary to the psychomotor tests, 61 are
on specific tests, 5 are on conditions for making good scores, and the
remaining 11 on distinguishing features of the tests.

(2) Administration.—Answers are marked directly on the standard
five-place IBM answer sheet. As mentioned previously, it is imperative
that this test be administered immediately after completion of the
psychomotor battery. Two sample items are duplicated below. The first refers
to figure 5.1, the second to figure 5.2.


FIGURE 5.1

SAMPLE ITEM OF PSYCHOMOTOR INSTRUCTION COMPREHENSION, CI626B

Your task in this test was to:

A. Follow a moving target.

B. Keep a stylus level during each trial.

C. Use a smooth, free-swinging motion of the arm and shoulder while
   following the moving target.

D. Keep the end of the stylus on the brass target as it moved.

E. Keep the stylus one inch off the target as it moved.

(3) Scoring.—The score on this test is simply the number of correct
responses.

Statistical results.—(1) Distribution statistics.—The distribution of
scores is indicated by a mean score of 58.90 and a standard deviation of
7.4, based upon a sample of 100 unclassified aviation students.

(2) Internal consistency.—The internal consistency of items is indicated
by a mean phi of 0.26 with a range from 0.00 to 0.59 and a standard
deviation of 0.11, based on the highest 27% and the lowest 27% of
400 unclassified aviation students.


FIGURE 5.2

SAMPLE ITEM OF PSYCHOMOTOR INSTRUCTION COMPREHENSION, CI626B

If the apparatus showed this pattern, you were to snap the:

A. Lower switch.

B. Upper and lower switches.

C. Right switch.

D. Upper switch.

E. Left and right switches.

(3)  Reliability  coefficient. — A  reliability  coefficient  of  0.75,  corrected, 
was  obtained  by  the  odd-even  method  on  a  sample  of  400  unclassified 
aviation  students. 

(4) Difficulty.—The difficulty level of items in this test is indicated by
a mean proportion of correct responses equal to 0.71, corrected for chance
success. The proportions ranged from 0.00 to 0.99 with a standard
deviation of 0.22. These data are based upon results from 400 unclassified
aviation students.

Variations of the test.—Psychomotor Instruction Comprehension Test,
CI626B, was preceded by a preliminary form, CI626A. This contained
30 items based on the 6 classification-battery psychomotor tests. Five
items were based on the orientation talk, an introductory speech given the
examinees before testing began. The chief difference between the
preliminary and final forms of this test is that in the earlier variation,
CI626A, all the items were completely verbal.

Evaluation.—This type of test has possibilities as a measure of ability
to remember instructions and should be studied in connection with
memory tests, particularly the test, Memory for Tactical Plans (see ch. 11).
It probably has nothing to offer as a test of verbal comprehension. If a
memory test is desired, one of purer composition and one with more
satisfactorily controlled administration could probably be designed.

EVALUATIONS  AND  CONCLUSIONS 
Vocabulary Tests

Statistical results have shown considerable differentiation between the
two types of tests used to measure verbal intelligence, as far as their
factorial composition and their ability to predict air-crew success are
concerned. The failure of part of the hypothesis that verbal ability is valid
for air-crew training success was indicated in the case of pilots and
bombardiers by results from the vocabulary tests. These results revealed a
slightly negative validity for pilots, a zero validity for bombardiers, and
a small validity for navigators (approximately 0.20). Vocabulary is the
best measure of the verbal factor, having a loading of 0.71. Although this
factor has some validity for navigators, reading comprehension tests have
a greater navigator validity than the vocabulary tests because of their
reasoning and numerical components.

Owing to its limited predictive value, the vocabulary test was dropped
from the classification battery in the summer of 1942. It was replaced by
the Technical Vocabulary Test, CE505C, which is a test of specific
technical information pertaining to piloting, navigation, and bombardiering.
This test, although related to vocabulary, and loaded with the verbal
factor (0.41 for the pilot score), possessed a validity of 0.21 for pilots.
Part of this validity is derived from the test's loading with the mechanical
experience factor (0.39). The remaining part is accounted for by its
loading with pilot interest (0.34).

Reading Comprehension Tests

The Reading Comprehension test has proved to be a useful classification
instrument, particularly for predicting navigator success. Mean
validities of 0.38 for navigators, 0.13 for bombardiers, and 0.20 for pilots
have been obtained. The pilot validity comes largely from a loading with
the mechanical experience factor (0.37 for form G). Although the verbal
factor is valid for the navigator, it is not the most important ability
involved in the navigator criterion. Furthermore, other tests in the
classification battery have covered this factor fairly well (General
Information, navigator score, with a verbal loading of approximately 0.60,
and Technical Vocabulary, navigator score, with a verbal loading of
approximately 0.75). A test to be highly valid for navigators should contain
a significant loading in the numerical factor. In spite of the navigator
validity of Reading Comprehension, CI614G, the numerical factor loading
of the test is low (0.14).

As far as future policy is concerned, it might be advisable to work
toward removal of reading comprehension from the classification battery,
in spite of its present contribution to the stanines. The reason is that
reading comprehension is factorially complex. It has effective loadings
in four factors: verbal, mechanical experience, numerical, and general
reasoning. Reading Comprehension's communality is high (0.87), which
indicates that almost all of the validities of the test are derived from the
four identified valid factors. The four factors are adequately covered by
other tests in the classification battery, so that Reading Comprehension is
merely a duplicate measure of them. It might be profitable, therefore, to
attempt to increase the loading in each of these factors for the particular
test that is the best known measure of the factor. If this were done, the
classification battery would contain purer measures of these factors. The
function of Reading Comprehension would then be more than adequately
supplanted.
CHAPTER SIX


Mathematical  Tests1 


INTRODUCTION

Prediction  of  Academic  Achievement 

Measures  of  mathematical  aptitude  or  achievement  have  long  been  used 
to  assist  in  the  prediction  of  success  not  only  in  more  advanced  mathe¬ 
matics  but  in  other  academic  pursuits  as  well.  The  use  of  these  measures 
to  predict  success  had  three  principal  bases.  First,  it  was  logical  to  sup¬ 
pose  that  performance  in  mathematics  would  be  positively  correlated  with 
performance  in  pursuits  having  mathematical  or  numerical  content.  Sec¬ 
ond,  mathematical  ability  was  generally  considered  to  be  one  of  the  best 
indices  of  abstract  intelligence.  Third,  it  was  widely  assumed  that  the 
exact methods and critical attitudes demanded in mathematics would be
carried over into the performance of other tasks. With respect to the
third point, it is generally agreed that much less transfer of training
takes place than was once supposed. Even though not accepted unreservedly,
these hypotheses offered a basis for believing that mathematics tests
would  prove  useful  in  predicting  success  in  air-crew  training. 

History  of  Mathematics  in  Air-Crew  Selection 

The  early  recognition  of  the  importance  of  mathematics  is  attested  by 
the  place  assigned  to  it  in  examinations  administered  to  determine  quali¬ 
fications  for  air-crew  training.  As  noted  in  chapter  4,  examinations  in 
lieu  of  high-school  graduation  and  in  lieu  of  completion  of  2  years  of 
college  were  initiated  in  1920  and  1927  respectively.  Mathematics  was 
apparently  prominent  in  both  levels  of  examination.  The  standardized 
objective  examination  adopted  in  1941  gave  great  emphasis  to  mathe¬ 
matics,  four  of  the  five  required  sections  being  mathematics  (arithmetic, 
algebra,  geometry,  and  trigonometry).  In  like  manner,  the  first  form  of 
the  AAF  Qualifying  Examination  which  was  adopted  in  January  1942 
had mathematics as an important constituent. One of the six sections was
mathematical  and  contained  three  types  of  problems:  arithmetic  reason¬ 
ing,  numerical  operations,  and  mathematics  achievement.  Significantly, 
performance  on  this  section  proved  to  have  a  biserial  correlation  of  0.64 
with  graduation-elimination  from  navigation  school.  This  figure  was  based 
on a group of 174 graduates and 47 eliminees.
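The biserial coefficient quoted above relates a continuous test score to a pass-fail criterion that is assumed to reflect an underlying normally distributed variable. The following is a minimal sketch of the conventional computing formula, r_bis = ((M_g - M_e) / SD_t) (pq / y), where y is the ordinate of the unit normal curve at the point dividing graduates from eliminees. The function name and the example means and standard deviation are hypothetical; only the group sizes (174 and 47) come from the text, and the source does not state which computing formula was actually used.

```python
from statistics import NormalDist


def biserial_r(mean_grad, mean_elim, sd_total, n_grad, n_elim):
    """Biserial correlation from group means (conventional formula, assumed here)."""
    p = n_grad / (n_grad + n_elim)   # proportion of graduates
    q = 1.0 - p                      # proportion of eliminees
    z = NormalDist().inv_cdf(p)      # normal deviate at the pass-fail cut
    y = NormalDist().pdf(z)          # ordinate of the unit normal curve at the cut
    return (mean_grad - mean_elim) / sd_total * (p * q / y)


# Hypothetical means and SD; only the group sizes are taken from the text.
example = biserial_r(mean_grad=25.0, mean_elim=20.0, sd_total=10.0,
                     n_grad=174, n_elim=47)
print(round(example, 3))
```

The factor pq / y depends only on the proportion graduating, so for a fixed mean difference the coefficient is largest when the two groups are of equal size.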

With the establishment of bombardier and navigator training schools
in 1940, the question of classification for air-crew training arose. This
problem  was  at  first  solved  by  selecting  bombardiers  and  navigators  from 

1 Written by T/Sgt. Paul C. Davis.
those eliminated from pilot training. Faculty boards that eliminated pilots
decided the type of training to which such eliminees should be sent.
Appreciation of the importance of mathematics in predicting success in
navigation is demonstrated by the fact that the faculty boards attempted to
send  to  navigator  training  men  who  were  trained  in  engineering  and 
mathematics.  Recognition  of  the  more  intellectual  or  academic  nature  of 
navigation is also indicated by the fact that eliminees for failure in
ground  school  were  not  sent  to  navigation  training. 

Job  Analysis  Findings 

The early job investigations described in chapter 1 revealed important
facts  concerning  job  requirements  and  yielded  leads  for  construction  of 
predictive  instruments.  According  to  these  analyses,  mathematics  had  an 
important  place  in  air  crew  tasks. 

The  Navigator. — Even  superficial  examination  of  the  duties  involved 
in navigation reveals that relatively high degrees of skill in making
computations and in interpreting data are required. In fact, early job analyses
of  the  duties  of  navigators  disclosed  that  numerical  and  mathematical 
abilities  were  probably  the  most  important  factors  influencing  success  in 
navigation. 

Among  the  many  duties  of  the  navigator  that  demand  mathematical 
knowledge  and  skill,  typical  examples  may  be  cited.  The  navigator  must 
calculate  drift,  distance,  and  direction  from  data  gathered  from  various 
instruments.  He  also  uses  computers,  such  as  the  E-6B,*  accurate  use  of 
which  requires  considerable  skill  and  mathematical  knowledge.  The  navi¬ 
gator  also  uses  many  mathematical  tables,  both  in  making  calculations 
(such  as  tables  of  squares  and  roots)  and  in  looking  up  pertinent  data. 
In  addition  to  these  specific  abilities,  the  navigator  must  have  a  keen  sense 
of  the  interrelationships  of  the  facts  which  he  has  gathered  and  must  be 
able  to  integrate  knowledge  of  these  facts  into  a  clear  and  accurate  picture 
which  will  enable  him  to  make  valid  navigational  decisions. 

In  one  early  survey,  an  analysis  was  made  of  reasons  for  failure  of 
navigator  trainees,  as  reported  by  members  of  a  navigation  school  staff. 
A  total  of  56  responses  was  categorized  according  to  cause  of  failure  (re¬ 
lated  to  intellectual  as  distinguished  from  physical  and  emotional  causes). 
Of  these,  37  indicated,  either  directly  or  indirectly,  a  lack  of  speed  or 
ability  in  numerical  or  mathematical  tasks. 

Analyses  also  indicated  that,  apart  from  the  distinctly  mathematical 
phases  of  the  navigation  job,  the  general  content  of  the  task  is  much  more 
academic than the jobs of either pilot or bombardier. The high correlation
of  mathematical  ability  with  general  academic  achievement  lent  support 

* The E-6B computer is a circular slide rule and a device for solving vector problems
encountered in dead reckoning. The slide-rule face may be used for solving problems involving
time, speed, distance, multiplication, division, proportions, true air speed (from calibrated air
speed), and true altitude (from calibrated altitude). The face used in solving vector problems
has a transparent plotting disk with a graduated compass rose which can be rotated with the
fingers. A slide under the disk is marked with concentric speed circles, radiating drift lines,
and a rectangular grid. The slide is seen through the plotting disk and may be moved back and
forth under the disk as desired. The plotting is done in pencil on the transparent disk.
to the hypothesis that mathematical ability would be a good predictor of
success  in  navigation. 

The  pilot. — In  general,  there  was  less  reason  to  expect  high  correlation 
between  mathematical  ability  and  success  in  pilot  training  than  in  naviga¬ 
tion  training,  although  early  concepts  of  the  requirements  of  the  pilot's 
job  pictured  it  as  a  fairly  intellectual  task.  The  qualifying  requirement  of 
2  years  of  college  training  or  its  equivalent  during  the  years  1927  to  1942 
indicates  the  emphasis  placed  upon  academic  aptitude  and  achievement. 
If  such  standards  be  valid,  mathematical  ability  could  reasonably  be  ex¬ 
pected  to  be  positively  correlated  with  success  in  pilot  training. 

Early  pilot-job  descriptions  mention  table  reading,  use  of  computers, 
and simple mathematical calculations. These functions are apparently
incidental, however, and constitute a relatively small proportion of the
pilot's  job.  It  might  be  assumed  justifiably,  therefore,  that  any  candidates 
who  passed  the  preliminary  hurdles  to  pilot  training  would  be  able  to 
perform  these  mathematical  tasks  satisfactorily. 

The  bombardier. — The  crucial  test  of  the  bombardier’s  proficiency 
comes  during  the  "bomb  run."  The  entire  success  of  the  bombing  mission 
depends  upon  the  speed  and  accuracy  of  his  performance  during  the  few 
seconds  preceding  the  bomb  release.  In  order  to  set  proper  data  into  the 
bomb  sight,  the  bombardier  must  read  several  tables  correctly,  use  a  com¬ 
puter  accurately,  and  make  relatively  simple  calculations,  e.  g.,  interpola¬ 
tion.  One  of  the  most  important  and  exacting  of  these  specific  duties  is 
determining  true  altitude  on  the  basis  of  temperature  readings  and  other 
pertinent  data.  This  computation  is  of  special  importance,  because  error 
will  inevitably  result  in  a  short  or  over  bomb  drop  unless  there  happen 
to  be  compensating  errors. 

Compared  with  the  navigator,  the  bombardier  has  fewer  mathematical 
data  to  integrate.  The  most  important  mathematical  requirement  of  the 
bombardier is that he perform the necessary calculations speedily and
accurately, since the time element in the bombing run makes decisions based
on  these  calculations  practically  irrevocable. 

Summary. — Job-analysis  findings  indicate  that  mathematics  is  ex¬ 
tremely  important  to  navigation  and  that  measures  of  proficiency  or 
achievement  in  mathematics  should  be  good  predictors  of  navigational 
success. To a much lesser degree, it appeared that mathematical ability
affects success in bombardiering. For the pilot it appeared that little
relationship exists between success and mathematical ability. In this chapter
two  types  of  tests  will  be  discussed — general  mathematics  and  numerical 
computations.  Arithmetical-reasoning  tests,  although  mathematical,  are 
primarily  reasoning  tests  and  so  will  be  discussed  in  chapter  7. 

GENERAL  MATHEMATICS  TESTS 
Mathematics  A,  CI702E  1 

This  form  is  typical  of  those  tests  devised  to  measure  ability  and 

*  Dmhyfd  u  Ptjr(b*Wck»l  Rnnrtt  Unrt  SV  J.  CbW  t»«lriWf  «i  Cm  U»|t  G. 

Hmpbrrr*.  M*J.  Wrmi  F.  M. 
achievement in advanced arithmetic, algebra, and trigonometry.

Description. — This  test  was  designed  to  measure  competence  in  mathe¬ 
matics.  In  general,  a  student  who  has  completed  high-school  algebra 
should  be  equipped  to  solve  most  of  the  problems. 

(1)  Internal  characteristics. — The  test  consists  of  30  five-alternative, 
multiple-choice  items.  The  last  three  items  require  knowledge  of  trigo¬ 
nometry.  The  following  problems  are  typical  of  the  first  27  items: 

(3x - 1)(2x + 2) =

A. 6x² - x + 2

B. 6x² + 4x - 2

C. 6x² - 4x -

D. 3x² - x + 3

E. x² + 2x - 6

R = c*d*. If c = 2 and d = -3, then R =

A. -108

B. -72

C. 36

D. 72

E. 108

If S = 3MR*, then M =

A. 3R*S

C. 3S / R*

E. 3R* / S

(2) Administration. The examinees are urged not to spend an undue
amount  of  time  on  problems  they  find  difficult.  Scratch  paper  is  furnished 
for  any  necessary  written  computations. 

(3) Scoring. The test was scored first with the formula R - W/4 and
later with the formula 2R - W/2. The change was made in order to obtain
scores of a magnitude which better fitted the system of weighting in
computing  the  composite  classification  score. 
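Both scoring formulas apply the same penalty for wrong answers; the second is simply twice the first. Read as a correction for guessing on five-alternative items, R - W/(k - 1) with k = 5, they can be sketched as below. The function name and the `scale` parameter are illustrative assumptions, and the guessing interpretation is the customary one rather than something the text states.

```python
def formula_score(rights, wrongs, n_alternatives=5, scale=1.0):
    """Rights minus a fraction of wrongs, optionally rescaled.

    scale=1.0 gives R - W/4 for five-alternative items;
    scale=2.0 gives the later formula, 2R - W/2.
    """
    return scale * (rights - wrongs / (n_alternatives - 1))


# An examinee with 20 rights and 8 wrongs on the 30-item test (2 items omitted).
first_formula = formula_score(20, 8)              # R - W/4
later_formula = formula_score(20, 8, scale=2.0)   # 2R - W/2
print(first_formula, later_formula)
```

Under blind guessing on all 30 items the expected result is 6 rights and 24 wrongs, which either formula scores as zero. Doubling the formula changes only the magnitude of the scores, not the rank order of examinees.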

Statistical  Results. — Quite  complete  statistical  data  were  obtained,  since 
this  test  was  included  in  the  classification  battery  for  some  time. 




( 1 )  Distribution  statistics. — Based  on  a  sample  of  9,622  unclassified 
aviation  students  (tested  at  Psychological  Research  Units  Nos.  1,  2,  and 
3  with  the  June  1942  Classification  Battery),  the  test  yielded  a  mean 
score (2R - W/2) of 19.4 and a standard deviation of 14.1. The distribution
curve for new aviation students, such as the sample cited above,
was  positively  skewed  and  markedly  flatter  than  normal. 

(2)  Internal  consistency. — Internal-consistency  item  analysis  of  this 
test  revealed  a  marked  degree  of  homogeneity.  Internal-consistency  phi 
values  based  on  administration  to  400  unclassified  aviation  students  ranged 
from  0.10  to  0.84,  with  a  mean  of  0.55  and  a  standard  deviation  of  0.19. 

(3)  Reliability  coefficients. — Reliability  of  the  test,  estimated  by  the 
odd-even  method,  was  0.92  (corrected  for  length),  based  on  200  cases 
tested  at  Psychological  Research  Unit  No.  2  in  April  1942. 
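"Corrected for length" in an odd-even estimate ordinarily refers to the Spearman-Brown formula, which steps the correlation between two half-tests up to the reliability of the full-length test. A minimal sketch follows, assuming that formula was the one applied; the function names are hypothetical. It also works backward from the reported 0.92 to the half-test correlation that figure would imply.

```python
def spearman_brown(r_half, length_factor=2.0):
    """Reliability of a test lengthened by length_factor, given r for the shorter form."""
    return length_factor * r_half / (1.0 + (length_factor - 1.0) * r_half)


def implied_half_correlation(r_full):
    """Odd-even correlation that would step up to r_full for a doubled test."""
    return r_full / (2.0 - r_full)


r_odd_even = implied_half_correlation(0.92)
print(round(r_odd_even, 3), round(spearman_brown(r_odd_even), 2))
```

The sketch suggests that an uncorrected odd-even correlation of about 0.85 corresponds to the corrected value of 0.92.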

(4)  Difficulty. — Based  upon  the  item  analysis  previously  referred  to, 
the  test  yielded  a  mean  difficulty  index,  corrected  for  chance,  of  0.48  with 
a  standard  deviation  of  0.18.  Testing  of  a  group  of  students  who  had  been 
selected  for  superior  performance  in  examinations  taken  prior  to  entering 
college  training  detachments  (upper  20  percent  based  upon  composite 
of  achievement-test  scores  in  geography,  history,  mathematics,  physics, 
and  reading  comprehension)  yielded  a  mean  difficulty  index,  corrected 
for  chance,  of  0.68  and  a  standard  deviation  of  0.15. 

(5)  Factorial  composition.— This  form  proved  to  have  significant  load¬ 
ings  with  three  factors  only.  The  verbal  factor  had  a  loading  of  0.53,  the 
numerical  factor  0.42,  and  the  visualization  factor  0.33,  in  an  analysis  in 
which  a  general-reasoning  factor  also  appeared,  but  with  a  loading  of 
only  0.12. 

(6)  Test  validity. — Validation  data  were  secured  against  all  air-crew 
and  some  technical-specialty  criteria.  The  data  are  shown  in  tables  6.1  to 
6.4  inclusive. 


in  training 


a>u 


41-$  l«  41-7* 
41-5  to  41-7* 
41-$  to  4$-7* 
43-8  to  41-11* 
41-14  to  41-18* 
41-14  to  41-11* 
41-1  to  41-4 
41-1  !•  41-4 

41-1  to  41-4 


Rceirck 

unit 


1 

3 

1 

I.  3.  J 
1.  2.  1 
I.  2,  1 
1 
1 
1 


Criterion 

M. 

SD, 

r»ll 

•rai»* 

Graduation- 

elimination 

552 

0.14 

ill 

11.7 

11.2 

0.08 

0  0.0 

Graduation* 

elimination 

129 

.88 

22.0 

20.9 

14.4 

.04 

0.00 

Graduation- 

elimination 

*69 

.12 

20.1 

18.9 

12.8 

.08 

0  0.0 

Graduation- 

elimination 

1.129 

.79 

18.1 

15.8 

12.4 

.12 

0.14 

Graduation- 

elimination 

456 

.14 

21.1 

15.5 

18.2 

.20 

.21 

Graduation- 

elimination 

524 

.86 

22.1 

18.0 

11.0 

.18 

.21 

Average  grade* 

191 

.... 

•  .  •  . 

.... 

.... 

*22 

4>  O  O  O 

Record  circular 
error* 

1*5 

.... 

•-.09 

•  ••♦ 

Comkat  circular 
error* 

195 

•  o  •  • 

*  •  V  . 

•  a.  . 

*08 

•  «  »  O 

• Assumed unrestricted stanine standard deviation not reported.

i  \(>  (nation  cadet t.  taking  1 2-week  tetirii  («*  training). 

• New aviation cadets, taking 18-week course (with navigation training).

*  RkIiuiM  taking  11-week  cwnt 

* Product-moment correlations.

• A highly unreliable criterion.
Table 6.4. Validity data for Mathematics, CI702E, for miscellaneous specialties


Group 

Criterion 

N. 

P. 

M. 

rn# 

Radio operator mechanics
Air  mechanic  armorer*1  . . 
Air  mechanic  armorer*  . . 

Flexible  gunner*1 . 

Flexible  gunner*' . 

Graduation-elimination 

Average  grade* . . 

Average  grade* . . 

Air-to-air  firing  . 

Final  examination  .... 

235 

232 

376 

194 

194 

0.65 

20.16 

15.38 

19.23  0.19 

14.74  *.21 

13.18  '.03 

13.3  '.11 
13.3  >06 

....  ..V  *  >y<-nuiogicai  nccjicn  unit  i\o.  i.  t-ntcred  tra 

I94J  at  AAF  Technical  School,  Sheppard  Field. 

'  Product-moment  correlation*, 
la  da**c*  tHI  and  43-4R,  at  Buckingham  Army  Air  Field. 


(7)  Item  validity. — Items  of  this  test  were  validated  against  the  pass- 
fail  navigator  criterion.  The  mean  validity  thus  obtained  was  0.09,  the 
standard  deviation  was  0.09,  and  the  range  was  from  —0.04  to  +0.33. 
Ten  of  the  30  items  yielded  validities  of  less  than  0.05,  which  indicates 
that  careful  revision  and  selection  of  items  might  increase  the  over-all  test 
validity. 

Evaluation. — This  test  proved  to  be  a  relatively  good  predictor  of  suc¬ 
cess  in  navigation  training.  For  other  air-crew  tasks  and  technical  spe¬ 
cialties  which  are  not  highly  loaded  intellectually,  the  test  proved  a  much 
less  satisfactory  predictor. 

Variations. — In  developing  and  refining  a  general  mathematics  test, 
several  successive  forms  were  prepared  and  administered. 

(1)  General  Mathematics — Form  A*. — This  is  the  first  form  of  gen¬ 
eral  mathematics  developed  for  use  in  classification.  It  contains  75  items, 
which are arranged in three parts of 25 items each. The parts are timed
separately, 15 minutes for each part. The test contains arithmetic-reasoning
items as well as mathematics items similar to those described under
form CI702E. This form was used in the classification of air-crew
candidates for a short time and was validated against the navigator pass-fail
criterion  in  training.  For  a  sample  of  478  cases  (tested  at  Psychological 
Research  Unit  No.  2),  the  test  yielded  a  biserial  correlation  of  0.51  with 
navigator  success.  Although  the  test  is  too  long  and  difficult,  the  results 
provided  considerable  impetus  for  further  exploration  of  the  usefulness 
of  mathematics  tests. 

(2) General Mathematics Test, Form II (CI702B)1. This form is
a revision of Form A, made easier and shortened to 60 items. The same
amount  of  time  is  allowed  as  for  Form  A.  The  same  categories  as  those 
in  Form  A  are  retained,  but  some  slight  changes  in  the  numbers  of  items 
in  the  various  categories  were  made.  The  categories  and  their  contents 
are:  Algebraic-equations  and  formulas,  21  items;  arithmetic,  16  items; 
plane  geometry,  8  items;  trigonometry,  5  items;  algebra,  4  items;  analytic 
geometry, 4 items; spherical geometry, 1 item; and solid geometry, 1 item.

The  difficulty  level  of  this  form  of  general  mathematics  is  much  more 
appropriate  than  that  of  the  previous  form.  Analysis  revealed  that  some 

*  Developed  «t  XiKvck  Unit  No.  I.  Chief  contributor:  U  Cot  Lanroneo  F. 

■OtnUH  Pirk*li(i{tl  Xftnrck  Unit  Nt.  )  k;  lk<  teat-eonttnacUMi  tuft 
items were still extremely easy and some extremely difficult, while some
had  low  internal  consistency.  This  form  was  used  for  a  short  time  only 
in  classification. 

(3) General Mathematics Test, Form III (CI702C)*. This form
also contains three parts of 20 items each and is administered with a total
time-limit of 45 minutes. It is a revision of form II in which the
extremely hard items, the extremely easy items, and those with low internal
consistency were either revised or replaced. Analysis was also made of
the items in terms of the place where such material was covered in the
navigation ground school. This form is less homogeneous (mean
internal-consistency phi = 0.39) and somewhat more difficult (mean difficulty
index, corrected for chance = 0.40) than form CI702E, previously
described.


(4) Mathematics A (CI702F)’. This form is a revision of the
CI702E form and contains 35 items, which is 5 more than in form E. The
time allowed for the test is 25 minutes. This form proved, as expected, to
be more difficult than the E form. The mean difficulty index for students
who had just taken mathematics in the college training detachment is 0.46,
as compared with a mean of 0.68 in Form E for a similar group. This
form  was  in  the  classification  battery  for  more  than  a  year,  as  a  navigator- 
selection  instrument. 

Factor  analysis  of  this  form  of  the  test  revealed  considerable  difference 
from  form  E,  although  the  contents  of  the  two  tests  are  superficially  very 
similar. Four factors have loadings above 0.20 based on a weighted
average of two analyses. In order of importance, these factors are: numerical
(0.51), verbal (0.37), mathematics background (0.37), and general
reasoning (0.24). It appears that the higher verbal loading of the E form
may  be  explained  by  the  fact  that  neither  the  mathematics-background 
factor  nor  the  general-reasoning  factor  was  isolated  in  the  battery  in 
which  the  E  form  appeared.  This  explanation,  if  correct,  accounts  for  a 
large  part  of  the  apparent  factorial  difference  between  the  two  forms  of 
the  test. 

(5)  Mathematics  A  (CI702GSI)1.— This  form  is  a  revision  of 
CI702F,  containing  57  items.  Prime  objectives  were  (1)  to  avoid  all 
arithmetic-reasoning  content,  (2)  to  include  more  items  in  higher  mathe¬ 
matics  and  thereby  broaden  the  base  of  the  test,  (3)  to  increase  the  diffi¬ 
culty  in  order  to  discriminate  better  among  the  more  capable,  and  (4)  to 
include  items  that  would  be  most  valid  for  navigator  selection.  Internal- 
consistency  item  analysis  against  total  scores  on  Mathematics  A,  CI702F, 
yielded  a  mean  phi  value  of  0.43.  Although  this  figure  is  lower  than  the 
mean  phis  for  previous  forms,  35  items  have  phi  values  of  0.43  and 
above  and  yield  a  mean  of  0.50.  Owing  to  the  fact  that  the  use  of  general 
mathematics in the classification battery was discontinued, no further use
was  made  of  this  form  of  the  test. 

COMPUTATION  TESTS 
Numerical  Operations,  CI701B* 

This  test  was  constructed  to  satisfy  the  need  for  measurement  of  per¬ 
formance  in  the  simple  arithmetical  processes.  Presumably,  proficiency  in 
these  simple  operations  should  have  important  bearing  upon  success  in 
other tasks where such operations are involved. Repeated observations of
the  tasks  of  bombardier  and  navigator,  the  former  in  particular,  led  to  a 
growing  conviction  that  the  mathematics  most  significant  is  computa¬ 
tional.  This  is  in  view  of  the  liberal  aid  supplied  to  the  students  in  the 
form  of  tables  and  other  accessories  and  of  rule-of-thumb  methods  taught 
in  ground  schools. 

Description.  (1)  Internal  characteristics. — This  test  involves  only  the 
four fundamental arithmetical operations. The problems are printed on an
expendable  IBM  answer  sheet.  The  front  of  the  sheet  contains  100  addi¬ 
tion  and  multiplication  problems  to  which  answers  are  given.  Each  answer 
is followed by two spaces for marking. If the answer is correct, the “C”
space  is  to  be  blackened;  and  if  the  answer  is  wrong,  the  “W”  space  is 
to  be  blackened.  The  use  of  response  “R”  for  right  was  avoided  to  pre¬ 
vent  confusion  with  the  other  common  opposition,  right  versus  left.  An¬ 
swers  to  the  first  three  items  are  already  marked  on  the  answer  sheet  to 
illustrate  the  method  of  answering.  The  back  of  the  test  sheet  contains  80 
subtraction and division problems with 5-alternative, multiple-choice
responses. The examinee is to blacken the space for the correct answer.
Answers to the first two of these problems are premarked to illustrate the
method  of  answering.  Because  of  the  extremely  low  absolute  difficulty  of 
the  problems,  the  time  limits  were  made  short,  thus  making  the  test  highly 
speeded. The following problems are typical of the content of this test.

Front

    11 + 19 + 22 = 52            C    W

    Multiply: 139 x 7 = 973      C    W

Back

    Subtract: 63 - 38:    25    21    29    32    26

    Divide: 233 ÷ 7:    39    37    33 2/7    37 2/7    35

Subtract : 

93 

Answer 

Add: 

12 

19 

5£ 

« = « 

28 

c 

— 

*= « 

59 

w 

- - 

34  rr 

(2) Administration. Instructions are printed upside down with
respect to the test problems so that strict control can be maintained on working

* This test embraced minor revisions of Form A, which had been constructed by the Cooperative
Test Service.


time.  The  examinees  are  permitted  to  go  on  to  the  back  of  the  sheet  if 
they  finish  the  front  before  time  is  called.  They  are  also  allowed  to  go 
back  to  any  part  of  the  test  to  check  or  correct  their  answers  if  time  is 
available.  The  time  allowance  is  5  minutes  each  for  the  front  and  back  of 
the  test  sheet. 

(3) Scoring. The test was first scored R - 3W, and later (R - 3W)/2
in order to obtain a smaller range of scores.

Statistical  results. — Extended  use  of  this  test  made  possible  the  accu¬ 
mulation  of  a  large  amount  of  statistical  data.  Only  samples  of  these  data 
are  given. 

(1) Distribution of scores. Administered to unclassified aviation
students, the test yielded the typical distribution constants given in table 6.5.


Table 6.5. Distribution statistics for Numerical Operations, CI701B, using the
scoring formula (R - 3W)/2


Group 

Part 

N 

M 

SD 

1,520 

2.J76 

888 

16.9 

S.S 

it.  5 

6.) 

West  Point  cadets,  class  of  1946  .. 

22.7 

5.6 

Back  . 

1,148 

1,14) 

888 

15.4 

6.0 

16.) 

6.0 

West  Point  cadets,  class  of  1946  . . 

Back  . 

22.) 

5.9 

nwKun  situ  I7i«  a  s/i  iivm  ur  ivsm  ninauii  vm<  *  *  vt 

*  Tested  in  September  and  October  1942  at  Psychological  Research  Unit  No.  X 
'Tested  in  August  and  September  1942  at  Psychological  Research  Unit  No.  X 

*  Tested  in  December  1942  at  Psychological  Research  Unit  No.  5. 


(2) Optimal scoring formula. Studies to determine the scoring
formula to maximize validity yielded the results given in table 6.6. For
scoring purposes, weights of approximately -3.0 for the front and -2.0 for
the back are recommended for the wrongs score when the test is used for
the selection of navigators. For bombardiers, weights of approximately
-0.5 for the front and 0 for the back are recommended for wrongs score.
Table 6.6 gives the data on which these statistics are based.
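The source does not give the formula by which the optimal weights were derived. One standard way to obtain them, sketched here as an assumption, is to choose the weight a that maximizes the correlation of the composite R + aW with the criterion C. Setting the derivative of that correlation to zero gives a = SD_R (r_WC - r_RC r_RW) / (SD_W (r_RC - r_WC r_RW)). The function names and the numerical inputs below are hypothetical and are not taken from table 6.6.

```python
import math


def composite_validity(a, sd_r, sd_w, r_rc, r_wc, r_rw):
    """Correlation of the composite R + a*W with the criterion C."""
    covariance = sd_r * r_rc + a * sd_w * r_wc   # in units of SD_C
    variance = sd_r ** 2 + (a * sd_w) ** 2 + 2.0 * a * sd_r * sd_w * r_rw
    return covariance / math.sqrt(variance)


def optimal_wrongs_weight(sd_r, sd_w, r_rc, r_wc, r_rw):
    """Weight for wrongs that maximizes composite_validity (least-squares result)."""
    return sd_r * (r_wc - r_rc * r_rw) / (sd_w * (r_rc - r_wc * r_rw))


# Hypothetical statistics: rights SD 10, wrongs SD 1.25, rights validity 0.40,
# wrongs validity -0.10, rights and wrongs uncorrelated.
stats = dict(sd_r=10.0, sd_w=1.25, r_rc=0.40, r_wc=-0.10, r_rw=0.0)
a_best = optimal_wrongs_weight(**stats)
print(round(a_best, 2), round(composite_validity(a_best, **stats), 3))
```

Because wrongs scores on a highly speeded, easy test vary far less than rights scores, even a small negative validity for wrongs can call for a weight well beyond -1, which is consistent in direction with the weights recommended for navigators.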


Table 6.6. Data pertaining to the derivation of optimal weights for wrongs score
of Numerical Operations, CI701B*


Sample 

N 

Part 

M. 

SD, 

snr 

*. 

(R+#\V)« 

Navigators  . . 
Navigators  .  j 
Bombardiers 
Bombardiers  J 

8.1* 
*.18 
97* 
97,  j 

Front 
Hack  . 
Front 
Back  . 

45.50 

42.20 

44.45 

)!.!0 

1)1 

1.11 

1.65 

1.17 

11.12 

10.14 

11.95 

10.20 

l.W 

1.10 

1.59 

1.47 

0)6 

.47 

.16 

.1* 

-0.12 

-.11 

.00 

-.Cl 

0.04 

.0) 

0)8 

.49 

.16 

.16 

-194 

-1.98 

-.44 

-.08 

* Symbols used in this table are as follows: R = rights score; W = wrongs score; C = criterion;
and a = weight for wrongs score.


Table 6.7. Estimates of reliability of Numerical Operations, CI701B


Type 

'a 

rrt 

Separately timed halves (front)*

Separately  timed  halves  (back)* . 

1.176 

1.176 

712 

0.48 

.66 

t  S  S  s 

0.64 

.79 

.8) 

712 

•  s  t  1 

J  S 

4.774 

.68 

•  •  •  * 

~  H.L'Ti:i - 

1  Special  adnunistntion  ctrtied  eut  it  Medical  and  Psycbolockal  Eaassieing  Unit  Na.  i. 
Table 6.9. Validity data for Numerical Operations, CI701B, using grades in
navigator training as criteria, for a sample of 463 trainees in classes 43-10 through
43-15 (at Hondo Army Air Field)


Criterion 


Grades in dead reckoning (ground school) ......... Front
                                                   Back
Grades in celestial navigation (ground school) ... Front
                                                   Back
Grades in dead reckoning (flight) ................ Front
                                                   Back
Grades in celestial navigation (flight) .......... Front
                                                   Back
Grades in meteorology ............................ Front
                                                   Back
Military grades .................................. Front
                                                   Back
Final composite grades ........................... Front
                                                   Back


*  Product-moment  correlations. 

1  Assumed  unrestricted  stanine  standard  deviation  not  reported. 


Table  6.10. — Validities  of  Numerical  Operations,  CI701B,  for  certain  technical 

specialties 


(3) Reliability. Two methods of estimating reliability produced
approximately the same results, as indicated by the data in table 6.7.
Retesting of the sample of 712 in table 6.7 was done after approximately 30
days’  time.  Although  front-back  correlation  is  not,  strictly  speaking,  an 
estimate  of  a  reliability,  the  true  reliability  of  the  parts  is  probably  no  less 
than  the  correlation  between  them. 

(4)  Factorial  composition. — This  test  is  one  of  the  few  relatively  pure 
tests.  Little  or  no  significant  variance  appears  in  any  factor  other  than 
the  one  so  characteristic  of  this  test  and  of  other  mathematical  and  numer¬ 
ical  tests— the  numerical  factor.  In  two  analyses,  in  which  total  score  on 
the test was used for determining intercorrelations, a weighted average
of  loadings  on  the  factor  is  0.66.  In  these  analyses  several  other  factors 
have  slight  loadings,  but  the  communality  is  relatively  low  (0.58)  for  the 
test.  In  two  other  analyses,  separate  front  and  back  scores  were  used  as 
the basis of intercorrelations. In these, weighted averages of the factor
loadings  are  0.80  for  the  front  section  and  0.82  for  the  back.  Even  smaller 


Specialty 


Criterion 


Air  mechanic  armorer1 


Radio  operator-mechanic1  ... 


Air  mechanic  armorer1 


Radio  operator-mechanic1  . . . 


Flexible  gunnery1 
Flexible  gunnery* 
Flexible  gunnery* 
Flexible  gunnery* 
Flexible  gunnery* 
Flexible  gunnery* 


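Average grades ............... Front
Average grades ............... Front
Pass-fail .................... Front
Average grades ............... Front
Average grades ............... Back
Average grades ............... Back
Pass-fail .................... Back
Average grades ............... Back
Air-to-air ................... Front
Final examination ............ Front
Composite ground range ....... Front
Jam Handy trainer ............ Front
Air-to-air ................... Back
Final examination ............ Back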


1 Tested at Psychological Research Units Nos. 1, 2, and 3.

* In class 43-48, tested at Psychological Research Units Nos. 1, 2, and 3.

* In class 43-45, tested at Psychological Research Units Nos. 1, 2, and 3.
amounts of variance are accounted for by other factors, a fact indicated
by the communalities of the parts, 0.68 for the front and 0.71 for the
back. From these indications, it appears clear that this test is a much
purer  measure  of  a  single  factor  than  is  commonly  achieved. 
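The purity claim can be checked with a little arithmetic. The sketch below
assumes orthogonal factors, so that a squared loading is the proportion of
test variance due to that factor, and divides it by the communality to
obtain the share of common variance carried by the numerical factor. The
function name is illustrative and does not come from the report.

```python
def numerical_share(loading: float, communality: float) -> float:
    """Share of a test's common variance that lies in one factor.

    Assumes orthogonal factors, so loading**2 is the variance due to the factor.
    """
    return loading ** 2 / communality


# Weighted-average loadings and communalities quoted in the text.
reported = {
    "total score": (0.66, 0.58),
    "front section": (0.80, 0.68),
    "back section": (0.82, 0.71),
}

for part, (loading, communality) in reported.items():
    share = numerical_share(loading, communality)
    print(f"{part}: {loading ** 2:.2f} of total variance, "
          f"{share:.0%} of common variance")
```

On these figures the numerical factor carries roughly three quarters of the
common variance of the total score and about 94 to 95 percent of the common
variance of each separately scored section.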

(5)  Test  validity. — In  view  of  the  extended  use  of  this  test  for  classifi¬ 
cation,  abundant  data  are  available  on  its  validity.  Tables  6.8  to  6.10  give 
typical  validation  results  for  aircrew  and  technical  specialties  respectively. 

Evaluation  of  the  test. — As  a  relatively  pure  measure  of  the  numerical 
factor,  this  test  appears  to  be  the  best.  As  is  true  for  most  other  tests  that 
have  high  loadings  with  that  factor,  the  usefulness  of  the  test  is  probably 
restricted  to  predicting  success  in  pursuits  that  require  rapid  use  and 
manipulation  of  numerical  symbols.  The  actual  importance  of  this  func¬ 
tion  to  a  task  can  be  ascertained  by  correlations  with  a  test  such  as  this. 

Variations of the test. — Other forms of this test differ in minor respects
only from the one just described, as indicated in the following paragraphs.

(1) Numerical Operations, CI701A, Form S. — This form was designed by the
Cooperative Test Service and is very similar in all respects to CI701B
already described. It was used for classification purposes for a short time
prior to the development of form CI701B.

(2) Numerical Operations, CI701BX1.* — This is an experimental test,
developed to measure the numerical factor. It contains 17 addition, 16
multiplication, 16 subtraction, and 16 division problems, plus 8 problems
involving more than one process, e.g., (6,125 × 8) / (30 × 15).
Multiple-choice answers are listed for all problems in this test. Scores in
this form correlate 0.71 with form CI701B front and 0.79 with form CI701B
back (N = 298 in both cases).

Numerical  Approximation,  CI706A  11 

This test is designed to measure the student's ability to estimate quickly
the accuracy of results of fairly simple arithmetic operations. It differs
from numerical operations in that emphasis is placed upon estimation rather
than upon computation. Bombardiers and navigators frequently must make
computations under pressure and in limited time. It is thus important that
they be able to check their work quickly. Gross errors, the most serious
ones, are usually detectable because of the unreasonableness of the results.
For example, misplaced decimal points and similar errors should be apparent
to one who sees the problem as a whole and is able to estimate within
reasonable limits the results of arithmetic operations. The cues to
discrepancies are many: numbers of digits, size of first and last digits,
position of decimal points, and the like.

* Developed at Psychological Research Unit No. 3. Chief contributors: Lt.
David H. Jenkins, Sgt. [name illegible].

11 Developed at Psychological Research Unit No. 1. Chief contributors:
T/Sgt. Paul C. Davis, Lt. Lina Hutchinson.

Description. — The items of this test require more complex computations
than those found in numerical operations and thus simulate more nearly the
type of problems familiar to navigators and bombardiers. To minimize the
sheer numerical-operation component, directions encourage the examinee to
estimate the answers roughly, not taking time to compute them. The time
limit also is set so short that those who stop to compute the exact answers
inevitably fail to complete enough items to obtain a good score.

(1) Internal characteristics. — The test consists of 15 scored items. The
processes involved include: (1) addition, (2) subtraction, (3)
multiplication, (4) division, (5) proportions, and (6) roots and powers. In
over half of the problems more than one process is involved, as in the
following examples:

1,000 : 2 = 9,950 : ____

A.  1.89. 

B.  4.975. 

C.  9.95. 

D.  19.9. 

E.  49.75. 

8,000 × (.96288 − .94208) × 1 = ____

A.  20. 

B.  80. 

C.  160. 

D.  208. 

E.  344. 
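The two sample items can be worked through as follows. This is a minimal
sketch: the report does not print the answer key, so the code simply picks
the listed option nearest to the exact value, and the helper name
nearest_option is an illustrative assumption.

```python
def nearest_option(exact: float, options: dict) -> str:
    """Return the letter of the listed answer closest to the exact value."""
    return min(options, key=lambda letter: abs(options[letter] - exact))


# First sample item: 1,000 : 2 = 9,950 : x, so x = 9,950 * 2 / 1,000.
proportion_item = nearest_option(
    9950 * 2 / 1000,
    {"A": 1.89, "B": 4.975, "C": 9.95, "D": 19.9, "E": 49.75},
)

# Second sample item: the difference in parentheses is about 0.02,
# so a quick estimate is 8,000 * 0.02 = 160 (the exact product is 166.4).
difference_item = nearest_option(
    8000 * (0.96288 - 0.94208) * 1,
    {"A": 20, "B": 80, "C": 160, "D": 208, "E": 344},
)

print(proportion_item, difference_item)  # D C
```

In the second item none of the options equals the exact product, which is
in keeping with the stated aim of rewarding estimation over computation.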

(2) Administration. — Two sample problems are given in the directions, and
the procedures in their solution are explained. Emphasis is placed upon the
necessity for speed and the desirability of estimating results rather than
computing them exactly. Testing time for the 15 items is 10 minutes.

(3) Scoring. — The test is scored with the formula 2R − W/2.
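Reading R as the number of right answers and W as the number of wrong
answers (the conventional reading, assumed here), the formula is the usual
correction for guessing on five-choice items, R − W/4, multiplied by two.
The sketch below is illustrative; the function and parameter names are not
from the report.

```python
def formula_score(rights: int, wrongs: int, n_options: int = 5,
                  scale: float = 2.0) -> float:
    """Rights minus a fraction of wrongs, scaled; omitted items count for nothing."""
    return scale * (rights - wrongs / (n_options - 1))


# 10 right, 4 wrong, 1 omitted on the 15-item test: 2(10) - 4/2 = 18.
print(formula_score(10, 4))   # 18.0

# Blind guessing on all 15 five-choice items averages 3 right and 12 wrong,
# which the formula scores as zero.
print(formula_score(3, 12))   # 0.0
```

With scale set to 1 the same function gives R − W/4, which ranks examinees
in exactly the same order.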

Statistical results. — Although this test appeared for only a short time in
the classification battery, considerable statistical data were obtained.

(1)  Distribution  statistics. — Based  on  administration  (at  Psychological 
Research  Unit  No.  2,  in  August  and  September  1942)  to  a  typical  sample 
of  1,520  unclassified  aviation  students,  the  test  yielded  a  mean  score  of 
10.9  and  standard  deviation  of  5.6.  The  distribution  was  approximately 
symmetrical. 

(2) Test reliability. — An odd-even estimate of reliability, on the basis
of 200 cases, yielded a corrected coefficient of 0.61. Since the test is
speeded, this is an overestimate. The presence of six apparently different
types of items and the extreme shortness of the test (15 items) may
separately or jointly account for this relatively low figure.
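The effect of test length on this figure can be illustrated. The sketch
assumes that the "corrected" coefficient was obtained with the
Spearman-Brown formula, which the report does not state explicitly here; the
projection to a 30-item test is purely illustrative and, for a speeded test,
would be an overestimate for the same reason the odd-even figure is.

```python
def spearman_brown(r: float, length_factor: float = 2.0) -> float:
    """Predicted reliability when a test is lengthened by length_factor."""
    return length_factor * r / (1 + (length_factor - 1) * r)


def half_test_correlation(corrected_r: float) -> float:
    """Invert the two-halves correction to recover the odd-even correlation."""
    return corrected_r / (2 - corrected_r)


raw = half_test_correlation(0.61)
print(round(raw, 2))                        # 0.44, implied odd-even correlation
print(round(spearman_brown(raw), 2))        # 0.61, the corrected coefficient
print(round(spearman_brown(0.61, 2.0), 2))  # 0.76, projected for 30 items
```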

(3) Factorial composition. — This test was not included, as such, in any
factor analysis, so no information is available regarding its factor
content. It is probable, however, that the test would have a relatively high
loading in the numerical factor best identified with the Numerical
Operations test. In the classification battery the Numerical Approximation
test was combined with Arithmetic Reasoning to yield a single score for
Mathematics B. The higher loading in the numerical factor (0.59) for the
Mathematics B test that contained Numerical Approximation than for the
Arithmetic Reasoning test alone (0.53) tends to confirm this belief.

(4)  Test  validity. — Validity  data  obtained  for  this  test  are  given  in 
tables  6.11  and  6.12. 


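Table 6.11. — Validity of Numerical Approximation, CI706A, for primary pilot
training (graduation-elimination criterion)

N        p      Mg     Me     SDt    rbis
1,520¹   0.75   11.1   10.3   5.6    0.09
1,148²    .76   10.7   10.2   5.4    (illegible)

p = proportion graduating; Mg, Me = mean scores of graduates and eliminees;
SDt = standard deviation of the total group; rbis = biserial correlation.

¹ In class 43-D, tested at Psychological Research Unit No. 2.
² In class 43-E, tested at Psychological Research Unit No. 2.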
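The coefficients in graduation-elimination tables of this kind appear to be
biserial correlations computed from the proportion graduating, the two
group means, and the total standard deviation. The sketch below applies the
textbook biserial formula to the first row of table 6.11 (N = 1,520,
proportion graduating 0.75, means 11.1 and 10.3, standard deviation 5.6).
That the report used exactly this formula is an assumption; the function
name is illustrative.

```python
from statistics import NormalDist


def biserial_r(mean_grad: float, mean_elim: float, sd_total: float,
               p_grad: float) -> float:
    """Biserial correlation between a test score and a pass-fail criterion."""
    q_elim = 1.0 - p_grad
    # Ordinate of the unit normal curve at the point cutting off p_grad.
    z = NormalDist().inv_cdf(p_grad)
    ordinate = NormalDist().pdf(z)
    return (mean_grad - mean_elim) / sd_total * (p_grad * q_elim) / ordinate


r = biserial_r(11.1, 10.3, 5.6, 0.75)
print(round(r, 3))  # 0.084
```

The result, about 0.08, agrees with the tabled 0.09 to within the rounding
of the published means.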


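Table 6.12. — Validity of Numerical Approximation, CI706A, for prediction of
combat-crew training success

[Table body largely illegible. Column heads: Group, Criterion, N, r. Legible
entries: three rows with N = 675 and r = 0.14, .31, and .10; further rows
with N "from 68 to 131" and r = .36 and .04; and a row for flexible gunnery
with the criterion air-to-air firing. The remaining group and criterion
labels and the two footnotes are illegible.]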


(5) Item validity. — Although no actual validity data against navigator
success are available for this test, items were correlated with average
academic grades in navigation preflight training. Results indicated that the
test should be a relatively good predictor of success in at least the
academic phases of navigation training. The mean phi value for the 15 items
was 0.17, the standard deviation was 0.11, and the range was from 0.04 to
0.45.

Evaluation. — This test is probably not significantly different in function
from Numerical Operations. The question as to which is the purer measure of
the numerical factor and which is the better predictor of success in
navigation cannot be answered on the basis of available data. If further
research should reveal this test to be a purer measure of the numerical
factor than Numerical Operations, its usefulness as a selection instrument
would be demonstrated. During the period when this test was used in the
classification battery, the score in Arithmetic Reasoning (Mathematics B)
included the score in this test. For this reason, no independent data for
this test were obtained during that period. Factorial content of the
composite score is discussed in the chapter on reasoning.

Variations of the test. — Certain preliminary forms of this test were
constructed, differing little in purpose or technique from the form already
described.

(1) Numerical Approximation, CI706AX1.12 — This first form of the test
consists of 30 multiple-choice items of the type described under form
CI706A. The form is moderately homogeneous, yielding a mean
internal-consistency phi of 0.42 with a standard deviation of 0.10. The
difficulty is approximately optimal, the mean difficulty index, corrected
for chance, being 0.51 and the standard deviation, 0.21, both based on the
proportion of examinees responding to the items.

(2) Numerical Approximation, CI706AX2.13 — This is a revision of the AX1
form, containing 15 items. It does not differ significantly from the
classification form (CI706A).


SUMMARY  AND  EVALUATION  OF  MATHEMATICS  TESTS 


As evidenced by the results discussed in this chapter, mathematical and
numerical tests proved to be very valid predictors of navigator success. In
view of this fact, most of the weight of prediction for navigators rested
upon mathematics tests in the early classification batteries.

In contrast to prediction of success in the other two air-crew positions, a
high degree of validity in predicting navigation success was attained by
using a very limited number of tests. Important among these were Numerical
Operations, Numerical Approximation, and General Mathematics tests. After
these tests had been used for some time and factorial, as well as validity,
data had been gathered, it became evident that there was considerable
overlap among them. Because the Numerical Operations test seemed the purest
of the three, and because alone it could carry the full burden of measuring
the numerical factor, the other two tests were dropped from the battery.
Scores in this test were weighted heavily in classification of navigators
but less heavily for bombardiers.

Possibly the most important contributions of the research described in this
chapter were the discoveries that (1) the numerical factor in itself is
exceptionally valid for navigator selection, and (2) most mathematical tests
derive a large part of their validity from this factor. Other valid factors,
but much less prominent, in mathematical tests are general-reasoning,
mathematics-background, and verbal factors, all of which have some validity
for navigator selection. All of these are better measured by means of
nonmathematical tests. The only really unique contribution of mathematics
tests, then, is the numerical factor. In the light of this fact, it is
evident that greatest economy can be achieved by using the purest possible
test of that factor. Of the mathematical tests employed in the
classification program, the Numerical Operations test appears to be the most
satisfactory from this standpoint.


It may be surprising to some that a supposedly highly intellectual task such
as mathematics, as measured by Mathematics A, shows significant variance in
the verbal, numerical, mathematics-background, and visualization factors
only. In view of the estimated reliability of the mathematics test (0.92 for
CI702E) and its communality (0.63), it is true that another factor or
factors, as yet unidentified, may account for considerable variance of the
test. It is significant to note, however, that the validity of the test for
navigator selection (0.42) is entirely accounted for by the known factors.
It is apparent, then, that whatever now undefined factors enter into the
factorial composition of the test, such factors are unrelated to navigation
success.

12 Developed at Psychological Research Unit No. 1. Chief contributors:
[illegible] Davis, Lt. Lina Hutchinson, Maj. Merrill F. Roff.

13 Developed at Psychological Research Unit No. 3. Chief contributors:
[illegible] Davis, Lt. Lina Hutchinson [remainder illegible].

CHAPTER SEVEN

Reasoning  Tests1 


INTRODUCTION 

The  original  impetus  for  the  development  of  reasoning  tests  was  pro¬ 
vided  by  early  formal  and  informal  job  analyses  that  indicated  the  im¬ 
portance  to  navigation  of  accurate  reasoning  with  words  and  numbers. 
The  job  analysis  data  presented  in  chapter  1  (see  especially  tables  1.3  and 
1.4)  are  sufficient  support  for  the  expectation  that  tests  of  reasoning, 
especially  arithmetic  reasoning,  would  have  moderate  to  high  correlations 
with  navigator  criteria. 

It  was  originally  thought,  too,  that  successful  performance  in  any  air¬ 
crew  position  required,  among  other  traits,  the  ability  to  reason  rapidly 
and  accurately.  It  was  believed  that  reasoning  was  involved  in  many  in¬ 
stances  of  what  the  pilot  instructors  called  judgment,  particularly  where 
decisions  were  required.  The  major  emphasis  placed  upon  judgment  in 
pilot  training  justified  efforts  to  discover  what  types  of  reasoning  tests 
might  cover  aspects  of  judgment.  Later  job  analysis  data,  however,  did 
not  entirely  support  this  line  of  thought  (see  tables  1.2  and  1.6)  nor  did 
later  test  results. 

While the worth of arithmetic-reasoning tests for selecting navigators
became apparent very quickly, they had very little validity for the pilot
criterion. The hypothesis was proposed that this failure of reasoning tests
to predict success in pilot training was due to the fact that they were
couched in verbal and numerical terms, and that neither verbal nor numerical
abilities had any relation to the success or failure of pilot trainees. An
intensive effort, therefore, was made to develop and validate nonnumerical
and nonverbal reasoning tests. Most of the tests discussed in this chapter
were developed in this search for a reasoning test valid for pilot
selection.

The informed reader will note that most of the tests are not new in type of
content or underlying rationale. This is attributable to the fact that
reasoning tests had been subjected to a great deal of investigation in
previous decades. It was felt desirable, therefore, to adapt the most
appropriate of these for the initial study of the relation of nonverbal and
nonnumerical reasoning tests to pilot success.

Reasoning tests that involve numerical and verbal variance will be discussed
first. Nonverbal, nonnumerical reasoning tests will then be discussed,
following which will be presented a factor analysis of both types of tests.

1 Written by Capt. John I. Lacey and [second author's name illegible].

NUMERICAL AND VERBAL REASONING TESTS


Arithmetic Reasoning, CI206C 2

The predominance of mathematics in the training and duties of navigators
insured the development of some type of mathematics test in the initial
phases of the classification and selection program. The first classification
battery included a mathematics test which, after several transitional forms,
became General Mathematics, CI702E.3 The early forms of this test included
both arithmetic-reasoning problems that could be solved with minimal formal
mathematical training and achievement problems requiring the use and
understanding of at least high-school mathematics, e.g., logarithms and
algebraic manipulations. The test was weighted more heavily with the latter
type of problem, thus making it primarily an achievement test. It was
thought desirable to construct separate tests, one an achievement test, and
the other an arithmetic-reasoning test. Arithmetic Reasoning, CI206C, is the
final form of the latter type of test. It was designed to be a more
difficult form than its immediate predecessors, CI206A and B, in order to
provide better differentiation among superior candidates for training in
navigation. The test was included in the Classification Battery of July
1943, and it has been used since that time.

Description. — The test consists of 30 arithmetic-reasoning problems. As
examples of the test problems, an easy problem and a difficult problem
follow.

If a plane is to fly 132 miles in 45 minutes, what must be its average speed?

A. 146.7 m. p. h.

B. 164 m. p. h.

C. 165 m. p. h.

D. 176 m. p. h.

E. 182 m. p. h.

A plane traveled a certain distance from the base at an average rate of 225
miles per hour. Engine trouble forced it to return at an average rate of 150
miles per hour. It left at 11:35 a.m. and returned at 12:05 p.m. How far away
from its base was the plane when it turned back?

A. 30 miles

B. 45 miles

C. 50 miles

D. 75 miles

E. 90 miles
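The reasoning behind the two sample problems can be set out explicitly. In
the sketch below, exact fractions are used so that no rounding enters; the
function names are illustrative and not from the report. The second problem
rests on the fact that the outbound and return legs cover the same distance
d, so that d/225 + d/150 must equal the 30 minutes elapsed.

```python
from fractions import Fraction


def required_speed(miles: int, minutes: int) -> Fraction:
    """Average speed in miles per hour needed to cover miles in minutes."""
    return Fraction(miles) / Fraction(minutes, 60)


def turnaround_distance(speed_out: int, speed_back: int,
                        total_minutes: int) -> Fraction:
    """Distance d satisfying d/speed_out + d/speed_back = total time."""
    total_hours = Fraction(total_minutes, 60)
    return total_hours * speed_out * speed_back / (speed_out + speed_back)


print(required_speed(132, 45))            # 176
print(turnaround_distance(225, 150, 30))  # 45
```

The results correspond to options D and B respectively. Note that 146.7,
the first option of the easy problem, is what results from dividing 132 by
0.9, presumably a distractor for examinees who mishandle the conversion of
45 minutes to hours.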

(1) Internal characteristics. — The items of the test are arranged roughly
in order of increasing difficulty. They are formulated in aviation terms in
the interest of face validity. All problems are presented simply and
concisely, in an attempt to minimize verbal variance.

(2) Administration. — The test is printed as the second half of a booklet,
the first half of which is Mathematics, CI702F.3 The first half is known as
Mathematics A, and the second half, as Mathematics B. The two tests are
timed separately. Scratch paper is provided to all examinees. The time limit
for Arithmetic Reasoning is set at 35 minutes.

(3) Scoring. — The scoring formula is 2R − W/2.

2 Developed at Psychological Research Unit No. 3. Chief contributors: Capt.
Lloyd G. Humphreys, Lt. David H. Jenkins, [third name illegible].

3 See chapter 6 for a discussion of this test and its predecessors.

Statistical results. — Due to its inclusion in the classification battery,
voluminous statistical data are available on this test. Typical data are
given below.

(1) Distribution statistics. — Typical distribution statistics are given in
table 7.1.


Table 7.1. — Distribution constants for Arithmetic Reasoning, CI206C

[The numerical columns of this table are missing. Groups listed:]

Unclassified aviation students¹
Unclassified aviation students²
Unclassified aviation students³
West Point cadets, Class of 1946

¹ Tested with the November 1943 Classification Battery at Medical and
Psychological Examining Units Nos. 4 through 10.

² Tested with the November 1943 Classification Battery at Psychological
Research Units Nos. 1, 2, and 3.

³ Tested with the July 1943 Classification Battery at Psychological Research
Units Nos. 1, 2, and 3.

(2) Internal consistency. — The degree of homogeneity of the items is
indicated by a mean internal-consistency phi coefficient of 0.50, a standard
deviation of 0.11, and a range of values from 0.17 to 0.75. These statistics
are based upon the responses of the highest 25 percent and the lowest 25
percent in total score of a group of 450 unclassified aviation students,
tested on June 23 and 24, 1943, at Psychological Research Unit No. 3.
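An internal-consistency phi of this kind compares, item by item, how many
examinees in the highest and lowest quarters on total score answered
correctly. The sketch below computes phi from the resulting fourfold table.
The counts in the example are hypothetical, chosen only to resemble quarters
of about 112 cases each; they are not data from the report.

```python
from math import sqrt


def phi_coefficient(upper_pass: int, upper_fail: int,
                    lower_pass: int, lower_fail: int) -> float:
    """Phi coefficient for a 2 x 2 table of group (upper, lower) by item result."""
    numerator = upper_pass * lower_fail - upper_fail * lower_pass
    denominator = sqrt(
        (upper_pass + upper_fail) * (lower_pass + lower_fail)
        * (upper_pass + lower_pass) * (upper_fail + lower_fail)
    )
    return numerator / denominator


# Hypothetical item: passed by 90 of 112 in the upper quarter
# and by 40 of 112 in the lower quarter.
print(round(phi_coefficient(90, 22, 40, 72), 2))  # 0.45
```

An item answered equally well by both groups yields a phi of zero, and an
item passed by everyone in the upper group and no one in the lower group
yields 1.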

(3)  Reliability  coefficient. — Two  estimates  of  reliability  are  given  in 
table  7.2. 


Table 7.2. — Reliability data for Arithmetic Reasoning, CI206C, based upon
samples of unclassified aviation students

Sample   N           Type        r (uncorrected)   r (corrected)
¹        (missing)   (missing)   0.63              0.77
²        (missing)   (missing)    .73               .84

¹ Tested at Medical and Psychological Examining Unit No. 10 with the
November 1943 Classification Battery.

² Tested at Medical and Psychological Examining Unit No. 7 from January 30,
1944, to February 14, 1944.


(4) Difficulty. — Based upon item analysis of the papers of 1,292
classified pilots, the mean proportion of correct responses, corrected for
chance success, is 0.57, with a standard deviation of 0.30 and a range from
0.00 to 0.92.
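The report does not spell out how the proportion of correct responses was
corrected for chance success. A common form of the correction, assumed here
for illustration, subtracts from the rights a fraction of the wrongs equal
to one over the number of wrong options and divides by the number of
examinees attempting the item. The counts below are hypothetical.

```python
def corrected_difficulty(rights: int, wrongs: int, n_options: int = 5) -> float:
    """Proportion passing an item, corrected for chance success.

    Based on examinees who attempted the item (rights + wrongs).
    """
    attempted = rights + wrongs
    return (rights - wrongs / (n_options - 1)) / attempted


# Hypothetical item: 700 right and 300 wrong among 1,000 who attempted it.
print(corrected_difficulty(700, 300))  # 0.625

# An item answered at the chance level (1 in 5 right) corrects to zero.
print(corrected_difficulty(200, 800))  # 0.0
```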

(5) Factorial composition. — The most significant loadings are in the
numerical (0.48), general-reasoning (0.47), verbal (0.27), and visualization
(0.19) factors. It is important to note that the test has a loading of only
0.12 in the mathematical-background factor. The communality is 0.72, to be
compared with the two reliability estimates of 0.84 and 0.77. For a full
picture of the factorial composition of this test, see appendix B.

(6) Test validity. — Validation results based on several samples are given
in table 7.3.


Table 7.3. — Validity data for Arithmetic Reasoning, CI206C

Group                          Criterion                      N       p
Pilots in primary training¹    Graduation-elimination         4,779   0.88
Pilots in primary training³    Graduation-elimination         2,346    .74
Pilots in primary training⁴    Graduation-elimination         3,146    .84
Pilots in primary training⁶    Graduation-elimination         1,823    .80
WASPs⁷                         Graduation-elimination           104    .61
Armorers in training⁸          Eighth week academic average     269   ....
Officer candidates¹⁰           (illegible)                      343   ....

[A further column reads 9.02, 9.04, 9.04, 8.92, and 8.46 for the first five
rows; its heading and the remaining columns, including the validity
coefficients, are illegible.]

¹ In Class 44-E, tested with the July 1943 Classification Battery at
Psychological Research Unit No. 3.

² Assuming an unrestricted stanine standard deviation of 2.00.

³ In Class 44-E, tested with the July 1943 Classification Battery at
Psychological Research Unit No. 1.

⁴ In Class 44-I, tested at Psychological Research Units Nos. 1, 2, and 3.

⁵ Assuming an unrestricted stanine standard deviation of 1.82.

⁶ In Class 44-E, tested with the July 1943 Classification Battery at
Psychological Research Unit No. 2.

⁷ In Class 44-W-8, tested by Medical and Psychological Examining Unit No. 8.

⁸ Tested at Medical and Psychological Examining Units Nos. 1 through 10. In
Lowry Field armament Classes 34-44A and 35-44A.

⁹ Product-moment correlation.

¹⁰ In training at Miami Beach, Class 44-E. Tested at Medical and
Psychological Examining Unit No. [illegible].


Variations. — Two forms directly preceded Arithmetic Reasoning, CI206C. The
initial form, Arithmetic Reasoning, CI206A,4 was administered experimentally
to unclassified aviation students and subjected to item analyses of
difficulty and of internal consistency. Items showing the highest
internal-consistency phi coefficients and the most appropriate difficulties
were combined with carefully selected new items to make the first permanent
form of the test, CI206B.4 This form entered the classification battery in
August 1942, to be replaced in July 1943 by Form C.

Description. — Form  B,  like  Form  C,  has  30  arithmetic-reasoning 
problems. 

(1) Internal characteristics. — The items of this test are also arranged
roughly in order of increasing difficulty and expressed simply and concisely
in aviation terms.

(2) Administration. — Arithmetic Reasoning, CI206B, like Form C, was
administered with an achievement test in the same test booklet. The
achievement test was Mathematics, CI702E,5 and was known as Mathematics A.
Mathematics B included not only Arithmetic Reasoning, CI206B, however, but
also Numerical Approximations, CI706A.5 Mathematics B was administered with
a time limit of 35 minutes.

(3)  Scoring. — Two  equivalent  scoring  formulas  were  used:  R — W/4 
and  2R — W/2.  From  August  1942  to  December  1942,  separate  scores 
were  secured  for  Numerical  Approximations  and  for  Arithmetic  Reason¬ 
ing.  From  December  1942  to  July  1943,  the  score  for  Mathematics  B  was 
the  sum  of  the  unweighted  component  scores  in  the  two  tests. 

4 Developed at Psychological Research Unit No. 3. Chief contributors: Capt.
Milton Burdman, Capt. Lloyd G. Humphreys, and Maj. Merrill F. Roff.

5 See chapter 6 for a discussion of this test.

Statistical results. — Statistical data are available both for Arithmetic
Reasoning, CI206B, alone, and for the combination of it with Numerical
Approximations, CI706A.

(1)  Distribution  statistics. — Typical  distribution  statistics  are  presented 
in  table  7.4. 


Table 7.4. — Distribution constants for Arithmetic Reasoning, CI206B, based
upon samples of unclassified aviation students (scored R − W/4)

N        M      SD
1,520¹   12.7   (illegible)
2,376²   12.4   5.3

¹ Tested in August and September 1942 at Psychological Research Unit No. 2.
² Tested in October 1942 at Psychological Research Unit No. 2.


(2)  Internal  consistency. — Data  are  available  for  arithmetic  reasoning 
scores  alone.  For  a  sample  of  400  unclassified  aviation  students  the  mean 
phi  coefficient  was  0.42,  with  a  standard  deviation  of  0.14  and  a  range 
from  0.10  to  0.73.  These  data  are  based  on  the  highest  25  percent  and  the 
lowest  25  percent  of  the  groups  in  total  score. 

(3) Reliability coefficient. — A sample of 200 unclassified aviation
students yielded an odd-even estimate of reliability of 0.80, corrected for
length, for the combination of CI206B and CI706A.

(4)  Difficulty. — For  a  sample  of  400  unclassified  aviation  students,  the 
test  (CI206B  alone)  yielded  a  mean  proportion  of  correct  responses  of 
0.52,  corrected  for  chance  success,  with  a  standard  deviation  of  0.21  and 
a  range  from  0.05  to  0.89. 

(5) Factorial composition. — The most significant loadings of Mathematics B
(combined scores) are in the numerical (0.57), general-reasoning (0.40),
verbal (0.29), and visualization (0.22) factors. The communality is 0.68,
compared to an estimated reliability of 0.80. Form CI206B alone was analyzed
in matrices in which the numerical factor was not defined. The comparable
loadings are: general reasoning, 0.57; verbal, 0.29; and visualization,
0.10. The communality is 0.51. For a full picture of the factorial
composition of this test, see appendix B.


Table 7.5. — Validation data for Arithmetic Reasoning, CI206B, based on the
graduation-elimination criterion

Group                          N        p      Mg      Me      SDt     rbis   r corr.
Navigation students¹ ²         1,970    0.79   47.54   40.54   12.82   0.32   0.48
Navigation students²             731    ....   ....    ....    ....     .32    .50
Pilots in primary training³    1,520     .75   13.1    11.5     5.3     .17   ....
Pilots in primary training⁴    1,148     .76   12.3    12.2     5.3     .01   ....
Pilots in basic training⁵      1,429    ....   13.0    12.2     5.3     .08   ....
Bombardier students⁶             552     .84   10.1     9.2     4.9     .11   ....
Bombardier students⁷             496     .82   13.4    11.9     5.3     (illegible)   ....

¹ Using combined scores in Arithmetic Reasoning, CI206B, and Numerical
Approximation, CI706A.

² In Classes 43-10 and 43-[illegible]. Tested at Psychological Research
Units Nos. [illegible] and 2.

³ In Class 43-D. Tested Aug. 6 to Sept. 8, 1942, at Psychological Research
Unit No. 1.

⁴ In Class 43-E. Tested Aug. 6 to Sept. 8, 1942, at Psychological Research
Unit No. 2.

⁵ In Class 43-F. Tested at Psychological Research Unit No. 2.

⁶ In Class 43-5-7. Tested at Psychological Research Unit No. 1.

⁷ In Class 43-5-7. Tested at Psychological Research Unit No. 2.


(6) Test validity. — Validation data are presented in tables 7.5 and 7.6.


Table 7.6. — Validation data for combined scores in Arithmetic Reasoning,
CI206B, and Numerical Approximation, CI706A, against seven navigation grades
for a sample of 463 navigation trainees¹

Criterion                                         r²     r corr.³
Grades in Dead Reckoning (ground school)          0.32   0.52
Grades in Celestial Navigation (ground school)     .29    .42
Grades in Dead Reckoning (flight)                  .23    .31
Grades in Celestial Navigation (flight)            .26    .38
Grades in Meteorology                              .16    .32
Military Grades                                    .12    .19
Final Composite Grades                             .38    .51

¹ In Hondo Classes 43-10 through 43-15. Tested at Psychological Research
Units Nos. 1, 2, and 3.

² Product-moment correlations.

³ Assumed unrestricted stanine standard deviation not reported.

(7) Item validity. — Based on a sample of 1,392 classified pilots, and
using graduation-elimination from primary training as the criterion (1,033
graduates), the mean validity phi coefficient was 0.10, with a standard
deviation of 0.05 and a range from 0.00 to 0.27.

Evaluation of Arithmetic Reasoning, CI206B and C. — Arithmetic reasoning
tests are among the most valid predictors of success in navigation training.
They are exceeded in that function in the classification battery only by the
Dial and Table Reading tests (see ch. 16). This validity is due primarily to
the tests' loadings in the numerical, reasoning, and verbal factors, and to
a small degree to the visualization loading. These factors account for the
following percentages of the variance of form CI206C: 23 percent, 22
percent, 7 percent, and 4 percent respectively.
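These percentages are the squares of the loadings reported for form CI206C
(0.48, 0.47, 0.27, and 0.19), as the sketch below confirms. It also
partitions the total variance into common, specific, and error portions
using the communality (0.72) and the two reliability estimates (0.84 and
0.77) quoted earlier for the same form. The partition follows the usual
factor-analytic convention and is offered as an illustration, not as a
computation taken from the report.

```python
# Loadings of Arithmetic Reasoning, CI206C, as quoted in the text.
loadings = {
    "numerical": 0.48,
    "general reasoning": 0.47,
    "verbal": 0.27,
    "visualization": 0.19,
}

# With orthogonal factors, a squared loading is a proportion of variance.
percent_of_variance = {
    name: round(100 * value ** 2) for name, value in loadings.items()
}
print(percent_of_variance)
# {'numerical': 23, 'general reasoning': 22, 'verbal': 7, 'visualization': 4}


def partition_variance(communality: float, reliability: float) -> dict:
    """Split total test variance into common, specific, and error parts."""
    return {
        "common": communality,
        "specific": reliability - communality,
        "error": 1 - reliability,
    }


for reliability in (0.84, 0.77):
    parts = partition_variance(0.72, reliability)
    print({name: round(value, 2) for name, value in parts.items()})
```

On the higher reliability estimate about 12 percent of the variance is
reliable but not shared with the known factors; on the lower estimate the
figure is about 5 percent.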

What small pilot validity the tests have is due to visualization and spatial
loadings, the other factors having no validity for pilots.

The data on the validity for armorers in training and for officer candidates
show, as might be expected, that the test is very useful for evaluating
general academic aptitude.

It is interesting to compare the factorial composition of Mathematics,
CI702F, and Arithmetic Reasoning, CI206C. The reader will remember
that the former was designed to measure mathematical achievement, and
the latter, quantitative reasoning ability. Mathematics, CI702F, has a
loading of 0.24 on the reasoning factor, whereas Arithmetic Reasoning
has an average loading of 0.47. The achievement test has a loading of 0.37
on the mathematical-background factor; the Arithmetic Reasoning test
has a loading of only 0.12, which might be a chance deviation from zero.
The intentions underlying the development of the two tests, therefore,
were realized fairly successfully.

It should be noted that far better tests of the numerical, verbal, and
visualization factors exist, but that the arithmetic-reasoning tests best
define the general-reasoning factor, albeit with moderate loadings. It is
hoped that a pure test of this factor will be found. When it is,
arithmetic-reasoning tests will lose their importance.

Number  Series,  CI215AX1  * 

The  development  of  a  number-series  test  was  undertaken  primarily  for 
the  purpose  of  analyzing  the  area  of  nonverbal  reasoning.  There  was  no 
expectation  that  the  test  would  add  to  the  combined  validity,  especially  for 
navigators,  of  already  existent  numerical  and  reasoning  tests.  The  test, 
based  on  the  well-known  number-series  completion  concept,  promised  low 
verbal  content  and  a  high  loading  in  a  reasoning  factor. 

Description. — Each  problem  in  the  test  consists  of  an  incomplete  num¬ 
ber  series.  It  is  the  task  of  the  examinee  to  determine  by  what  rule  of 
progression  the  series  was  constructed,  and  then  to  fill  the  gaps  left  in  the 
progression  with  the  missing  numbers.  Since  two  numbers  are  omitted  in 
each  progression,  a  problem  contains  two  separately  scored  responses. 
Since  the  difficulty  of  a  problem  is  largely  dependent  upon  the  determina¬ 
tion  of  a  rule  of  progression,  and  not  upon  the  simple  arithmetic  involved, 
the  examinee  usually  answers  the  items  of  a  problem  as  a  pair,  correctly 
or  incorrectly. 

(1) Internal characteristics. — The test is divided into 2 parts, each part
containing 19 problems (38 scored responses). Part I also contains two
unscored sample problems. These sample problems are reproduced below,
with accompanying text from the directions.


Sample Problems 1 and 2:

4   6   ___   10   12   ___

    (1)         (2)
  A.  4       A. 10
  B.  5       B. 14
  C.  8       C. 15
  D. 12       D. 16
  E. 20       E. 18

The series above consists of numbers which increase by twos.
Therefore, the answer to problem 1 is 8, and the answer to
problem 2 is 14.

Sample Problems 3 and 4:

29   22   16   11   7   ___   ___

    (3)         (4)
  A.  0       A. 1
  B.  1       B. 2
  C.  3       C. 3
  D.  4       D. 4
  E. 10       E. 5

These numbers decrease by an amount which each time is decreased by one.
That is, 22 is 7 less than 29, 16 is 6 less than 22, 11 is 5 less than 16, and 7 is 4
less than 11. Now continuing the series, 4 is 3 less than 7, and 2 is 2 less than 4.
Therefore, the answer to sample problem 3 is 4, and you should have blackened
the space under D on your answer sheet. The answer to problem 4 is 2, and you
should have blackened the space under B on your answer sheet.
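The rule in sample problems 3 and 4 (a step that itself changes by a constant amount) can be checked mechanically. The sketch below is only an illustration: the function name `extend_series` is hypothetical, and it handles only gaps at the end of a series, not gaps in the middle as in sample problems 1 and 2.

```python
def extend_series(series, n_missing=2):
    """Continue a series whose successive differences change by a constant amount."""
    # First-order differences: the step between neighbouring numbers.
    diffs = [b - a for a, b in zip(series, series[1:])]
    # Second-order differences: how much each step differs from the previous step.
    second = [b - a for a, b in zip(diffs, diffs[1:])]
    if len(set(second)) > 1:
        raise ValueError("the step does not change by a constant amount")
    step_change = second[0] if second else 0
    step = diffs[-1]
    last = series[-1]
    missing = []
    for _ in range(n_missing):
        step += step_change
        last += step
        missing.append(last)
    return missing


# Sample problems 3 and 4 from the directions.
print(extend_series([29, 22, 16, 11, 7]))  # [4, 2]
```

The two values returned correspond to the keyed answers D (4) and B (2).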

* Developed at Psychological Research Unit No. 3. Chief contributors: Lt. David H. JkiUm
and Jean R. Cjrona.

(2) Administration. — A brief statement of the examinee's task and the
two sample problems are printed on the cover of the test booklet. This,
the  explanation  and  solution  of  these  problems,  and  a  paragraph  caution¬ 
ing  the  examinees  to  avoid  sheer  guessing  constitute  the  formal  adminis¬ 
trative  directions  for  the  test.  The  two  parts  of  the  test  are  given  and 
timed  separately.  The  time  limit  for  part  I  is  14  minutes;  for  part  II,  18 
minutes.  The  difference  in  time  allotted  to  the  two  sections  allows  for  the 
increasing  difficulty  of  the  problems.  Directions  for  the  test  can  be  admin¬ 
istered  in  4  minutes,  bringing  the  total  testing  time  to  36  minutes. 

(3) Scoring. — The scoring formula is R - W/4.
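The formula R - W/4 has the form of the conventional correction for guessing, R - W/(k - 1), for items with k = 5 alternatives (the sample problems show choices A through E). The source states only the formula, so the general form and the function name `formula_score` below are assumptions made for illustration.

```python
def formula_score(rights, wrongs, n_alternatives=5):
    """Rights minus a fraction of wrongs: R - W / (k - 1).

    With five alternatives this reduces to R - W/4. Omitted items are
    neither counted as right nor as wrong.
    """
    return rights - wrongs / (n_alternatives - 1)


# A hypothetical examinee with 50 right and 12 wrong responses.
print(formula_score(50, 12))  # 47.0
```

Under the same assumption, a six-choice test would be scored R - W/5.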

Statistical results. — Based, for the most part, on single samples of
moderate sizes, the data for this test are relatively complete but not
sufficiently extensive to be conclusive.

(1)  Distribution  statistics. — A  sample  of  194  classified  pilots  (class 
44-A)  yielded  a  mean  score  of  47.9,  with  a  standard  deviation  of  11.8. 
The  distribution  curve  is  negatively  skewed  and  somewhat  flatter  than 
normal. 

(2)  Internal  consistency. — The  degree  of  homogeneity  of  the  items  is 
indicated  by  a  mean  internal-consistency  phi  of  0.30,  with  a  standard 
deviation  of  0.17  and  a  range  of  values  from  0.00  to  0.84.  These  statistics 
are based upon an analysis of the responses of the highest 25 percent and
the  lowest  25  percent  in  total  score  of  a  group  of  480  unclassified  aviation 
students,  tested  at  Psychological  Research  Unit  No.  3  in  April  1942  and 
May  1943. 

(3)  Reliability  coefficient. — By  the  alternate-forms  method,  an  esti¬ 
mated  reliability  coefficient  of  0.75,  corrected  for  length,  was  obtained. 
This  figure  is  based  on  a  sample  of  204  unclassified  aviation  students. 

(4)  Difficulty. — Based  upon  item  analysis  of  the  responses  of  the  480 
unclassified  aviation  students  mentioned  above,  the  test  yielded  a  mean 
proportion  of  correct  responses  of  0.76,  corrected  for  chance,  with  a  range 
from  0.20  to  0.99  and  a  standard  deviation  of  0.21.  For  part  I  the  mean 
is  0.80,  with  a  range  from  0.25  to  0.99  and  a  standard  deviation  of  0.20. 
For  part  II  the  mean  is  0.70,  range  0.20  to  0.99,  standard  deviation  0.24. 

(5) Factorial composition. — Significant loadings appear only in the
numerical (0.47) and general-reasoning (0.36) factors. The communality
is 0.47. For a full picture of the factorial composition of this test, see
Appendix B.

(6) Test validity. — Validation results are presented in table 7.7.


Table 7.7. — Validation data for Number Series, CI215AX1

Group                             Criterion                 N      p      Mg      Me      SD      r bis      r corr. (2)

Pilots in primary training (1)    Graduation-elimination    194    0.86   48.70   42.95   11.75   0.27       0.31
Navigation students               Flight mission grades     200                                   (3) .15
Navigation students (4)           Ground mission grades     200                                   (3) .11
Navigation students (4)           Weighted average          200                                   (3) .25

(1) In class 44A. Tested at Psychological Research Unit No. 3.
(2) Assuming an unrestricted stanine standard deviation of 2.00.
(3) Product-moment correlation.
(4) Same sample as above.
Evaluation. — Only 47 percent of the total variance of the test is
accounted for by the common factors extracted in an analysis of the
nonverbal reasoning battery, to be described later in this chapter. Since the
test has a fairly high reliability (0.75), there remains a substantial amount
of undefined nonerror variance. Future research should attempt to define
this unknown variance.
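The amount of undefined nonerror variance follows from the two figures quoted: whatever is reliable but not explained by the common factors. The sketch below assumes the usual partition of total variance into common, specific, and error parts; the function name `variance_partition` is hypothetical.

```python
def variance_partition(reliability, communality):
    """Split total test variance (taken as 1.0) into three parts.

    common   : variance explained by the extracted common factors
    specific : reliable variance not explained by those factors
    error    : unreliable variance
    """
    return {
        "common": communality,
        "specific": round(reliability - communality, 2),
        "error": round(1 - reliability, 2),
    }


# Number Series, CI215AX1: reliability 0.75, communality 0.47.
print(variance_partition(0.75, 0.47))
# {'common': 0.47, 'specific': 0.28, 'error': 0.25}
```

On these figures, roughly 28 percent of the total variance is reliable but undefined.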

Nineteen percent of the total test variance is attributable to the
numerical factor and 13 percent to the general-reasoning factor. Much better
tests of these factors exist.

The  navigator  validity  of  the  test  is  moderate  and  is  due  primarily  to 
the  test’s  loadings  in  the  numerical  and  reasoning  factors.  The  validity  to 
be  expected  from  these  two  factors  alone  would  be  close  to  0.30,  which  is 
notably  higher  than  the  obtained  validities. 

The  pilot  validity  of  0.31  was  found  for  a  small  sample,  and,  judging 
from  the  factorial  composition  of  the  test,  is  in  considerable  error.  A  pilot 
validity  of  0.12  was  found  for  a  comparable  form  (see  below)  on  a  much 
larger  sample  of  2,115  cases.  The  weighted  average  of  these  validities  is 
0.13.  The  pilot  validity  expected  for  this  test,  based  upon  factor  estimates, 
is 0.04, leaving much obtained validity to be accounted for by unknown
factor  variance.  For  this  reason  the  test  deserves  further  analytical  study. 

Reasoning  Test,  CI215A 

This  version  of  the  Number  Series  Test  differs  from  the  CI215AX1 
form  in  directions  and  in  the  number  and  specific  content  of  the  problems. 

Evidence  from  factor  analysis  indicates  that  the  numerical  and  reason¬ 
ing  factors,  which  chiefly  characterize  the  Number  Series  Test,  are  not 
related to pilot success. It was thought that a modified form of the test
might have a sufficiently high correlation with the pilot stanine and a
sufficiently low one with the pilot primary graduation-elimination criterion to
justify  its  inclusion  in  the  classification  battery  with  a  negative  weight 
assigned  for  pilots. 

New  directions,  accordingly,  were  written  to  give  the  test  a  "pilot 
slant,"  the  purpose  being  to  prevent  men  with  a  strong  preference  for  pilot 
training from slighting the test.

Description. — The  cover  of  the  test  booklet,  formerly  carrying  test 
directions,  now  portrays  a  full-page  picture  of  two  United  States  pursuit 
planes  and  a  burning  enemy  craft.  The  directions,  formerly  a  terse  out¬ 
line  of  the  test-task,  were  increased  by  170  words  devoted  to  the  relation¬ 
ship  of  the  test  to  pilot  and  other  air-crew  duties.  The  number  of  test 
problems was reduced to 25 (50 scored responses). Testing time, including
3 minutes for administration, totals 23 minutes. The scoring formula
is R - W/4.

Statistical results. — The available data are restricted to distribution
statistics, item difficulty, and validity.

(1) Distribution statistics. — A sample of 1,390 classified pilots (class
44G, tested in January 1944 at Psychological Research Unit No. 3)
yielded a mean score of 21.6, a standard deviation of 5.80.

(2) Difficulty. — Based on a sample of 728 classified pilots, the test
yielded  a  mean  proportion  of  correct  responses  of  0.52,  corrected  for 
chance  success,  with  a  standard  deviation  of  0.31  and  a  range  from  0.01 
to  0.96. 

(3) Test validity. — For a sample of 2,115 classified pilots (class 44G,
tested at Psychological Research Unit No. 3), using graduation-elimination
from primary training as the criterion, the uncorrected biserial r was
0.10; corrected for restriction of range, the validity was 0.12. Of this
sample, 89 percent were graduates. The mean score of graduates was 20.83,
of eliminees 19.68, and the over-all standard deviation was 5.60. For this
same sample, the correlation with pilot stanine was 0.23, corrected for
restriction of range.

(4) Item validity. — For a sample of 600 graduates and 128 eliminees
from  primary  training  (class  44G),  the  mean  phi  coefficient  was  0.04,  the 
standard  deviation  0.06,  and  the  range  from  —0.08  to  0.18. 

Evaluation. — The validity of 0.12, compared with the correlation of
0.23 with the pilot stanine, precludes the use of this test as a suppression
variable for pilot selection. This test should be factorially similar to
Number Series, CI215AX1.
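One way to see why these two figures rule out a negative weight is the standardized regression weight for a second predictor in a two-predictor equation. The sketch below is a simplification made for illustration only: it treats the pilot stanine and the test as the only two predictors, and the stanine validity of 0.45 used in the example is a hypothetical value, not a figure from this section.

```python
def beta_for_test(r_crit_test, r_crit_stanine, r_test_stanine):
    """Standardized weight of the test when combined with the stanine.

    Two-predictor formula: (r_ct - r_cs * r_ts) / (1 - r_ts ** 2).
    A negative result means the test would act as a suppressor.
    """
    return (r_crit_test - r_crit_stanine * r_test_stanine) / (1 - r_test_stanine ** 2)


def stanine_validity_needed(r_crit_test, r_test_stanine):
    """Stanine validity above which the test's weight turns negative."""
    return r_crit_test / r_test_stanine


# Reasoning Test, CI215A: validity 0.12, correlation with stanine 0.23.
# The stanine validity of 0.45 is a hypothetical value for illustration.
print(round(beta_for_test(0.12, 0.45, 0.23), 3))
print(round(stanine_validity_needed(0.12, 0.23), 2))
```

Under this simplified model the weight stays positive unless the stanine validity exceeds about 0.52, which is consistent with the conclusion that the test cannot serve as a suppression variable.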

Logical  Sequence  (Numerical  Sequence),  CI217A 

This  test  was  developed  at  Tuskegee  Army  Air  Field  for  possible  use 
in  the  classification  of  Negro  air  crew.  It  is  in  completion  form  rather 
than  multiple-choice  form.  Initial  informal  reports  of  exceedingly  high 
validities  against  a  pilot  criterion  for  Negro  aviation  students  were  made. 
Since  multiple-choice  reasoning  tests  were  known  to  have  so  little  validity 
for  pilot  selection,  the  test  was  forwarded  by  Headquarters  AAF  Train¬ 
ing  Command  to  Psychological  Research  Unit  No.  3  for  study. 

Some  items  were  added  to  the  test,  and  with  Pattern  Sequence,  CI217B, 
it was administered in an intercorrelational study, designed to reveal
whether utilizing free-response rather than multiple-choice forms of a
test  changes  factorial  composition. 

Description. — The  test  is  a  typical  number-series  test,  but  it  varies  in 
form of presentation from Number Series, CI215AX1, described above.

(1) Internal characteristics. — The number series are punctuated in a
manner that assists the examinee in understanding the internal
relationships of the digits. For example, one problem reads as follows: 13-10;
11-7; 9-4. The examinee supplies the next two numbers.

Unlike most printed tests developed by the Aviation Psychology
Program, problems of the Numerical Sequence test are not answered by the
selection of one or more prepared alternatives. In place of the standard
IBM answer sheet, a special blank is provided, and the answer to each
item must be written by the examinee.

The test is made up of 1 sample problem, 1 practice problem, and 40
scored items. The scored items are divided equally between two
separately timed parts of the test. The number of digits in each problem ranges
from 6 through 10, and the items are arranged in approximate order of
increasing difficulty.

(2)  Administration. — Administrative  directions  for  this  test  are  short 
but  adequate.  The  test  is  explained  with  the  assistance  of  one  sample 
problem  and  one  practice  problem. 

The total testing time is 13 minutes: directions, 3 minutes; part I, 5
minutes; part II, 5 minutes.

(3)  Scoring. — For  purposes  of  analysis  both  the  number  of  correct  and 
the number of incorrect responses are recorded for this test.

Statistical results. (1) Test validity. — Test validity data are available
for Negro trainees. They do not support the initial claim of high validity
for this test. For a group of 468 graduates and 217 eliminees, using the
primary graduation-elimination criterion, the uncorrected biserial r was
0.04. The mean score of graduates was 7.22, of eliminees 6.98, and the
over-all standard deviation was 3.45.
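The reported coefficient can be checked from the summary statistics with the textbook biserial formula, r = ((Mg - Me) / SD) * (p * q / y), where y is the ordinate of the unit normal curve at the point dividing proportions p and q. The source does not state which computing formula was used, so this is a sketch under that assumption; the function name `biserial_r` is hypothetical.

```python
from statistics import NormalDist


def biserial_r(mean_graduates, mean_eliminees, sd_total, p_graduates):
    """Biserial correlation from group means, total SD, and pass proportion."""
    q_eliminees = 1 - p_graduates
    unit_normal = NormalDist()
    # Point on the normal curve that cuts off the proportion of graduates.
    z_cut = unit_normal.inv_cdf(p_graduates)
    ordinate = unit_normal.pdf(z_cut)
    mean_difference = (mean_graduates - mean_eliminees) / sd_total
    return mean_difference * (p_graduates * q_eliminees) / ordinate


# Numerical Sequence, CI217A: 468 graduates, 217 eliminees.
r = biserial_r(7.22, 6.98, 3.45, 468 / (468 + 217))
print(round(r, 2))  # 0.04
```

The result agrees with the uncorrected value of 0.04 given in the text.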

(2) Intercorrelations. — Some selected intercorrelations are shown in
table 7.8. These data allow a comparison between two tests (Numerical
Sequence, CI217A, and Reasoning, CI215A) that presumably would be
very similar factorially, except for possible differences attributable to the
different modes of presentation (free-response v. multiple-choice).


Table 7.8. — Product-moment correlations of Numerical Sequence, CI217A, and
Reasoning, CI215A, with selected tests (N=353 unclassified aviation students) (1)

                                                  Correlations with
                                             Reasoning      Numerical Sequence
Test                                         (R-W/4)        R                  W

Numerical Operations (Front), CI701B         [illegible]    0.50               -0.20
Numerical Operations (Back), CI701B          [illegible]     .48                -.14
Dial and Table Reading, CP621-622A            .44            .5[illegible]      -.22
Speed of Identification, CP610A               .16            .22                [illegible]
Spatial Orientation I, CP501B                 .21            .26                -.02
Spatial Orientation II, CP503B                .18            .07                -.10
Arithmetic Reasoning, CI206C                  .41            .50                -.26
Reading Comprehension, CI614H                 .17            .40                -.17
Reasoning, CI215A                             ....           .54                -.14

(1) Tested in October 1944 at Medical and Psychological Examining Unit No. [illegible].

Evaluation. — The test's pilot validity was overestimated in early
reports. The validity coefficient of 0.04 reported for a fairly large sample is
in accord with expectations for a numerical and reasoning test.

The data in table 7.8 reveal some interesting differences between the
multiple-choice, number-series test (Reasoning, CI215A) and the
completion form (Numerical Sequence). The higher correlations of Numerical
Sequence with Numerical Operations, front and back, and with Dial and
Table Reading,* leave little room for doubt that it has a higher loading on
the numerical factor than does Reasoning, CI215A. Whether this is
entirely due to the difference between multiple-choice and free-response
forms is a moot question, since there are other minor differences between
the two tests. This explanation, however, seems reasonable.

The slightly higher correlations of Numerical Sequence with Speed of
Identification and Spatial Orientation I,* again, suggest a higher loading
of the free-response form on the perceptual-speed factor. The lower
correlation of the test with Spatial Orientation II, however, casts doubt upon
this conclusion.

* See chapter 28 for a complete description of the factorial composition of these tests.

The higher correlation of Numerical Sequence with Arithmetic
Reasoning could be due either to an increased saturation with the numerical
factor or with the general-reasoning factor. The very slight increase in
correlation with Reading Comprehension, however, suggests that the
latter interpretation is more likely.

The  correlation  of  Numerical  Sequence  with  Reasoning,  CI215A,  is 
only  0.54.  Unless  the  former  test  is  quite  unreliable,  this  suggests  less 
communality  between  the  two  tests  than  should  be  expected. 

These  data,  of  course,  are  more  suggestive  than  they  are  conclusive. 


NONNUMERICAL,  NONVERBAL  REASONING  TESTS 
Decoding, CI214AX2 *

This is one of the battery of nonverbal, nonnumerical tests of reasoning
ability, developed in the hope of finding a reasoning ability that would be
valid for pilots. It should be noted that the terms nonverbal and
nonnumerical, as applied to this and other tests discussed in this section, do not
mean that words and numbers do not enter into the test. They do signify
that the test was constructed to minimize numerical and verbal variances.

* Developed at Psychological Research Unit No. 3. Chief contributors: S/Sgt. J. Cordoo Etldo,
Jew R. Ljfoaa.

Description. — The test requires the decoding of short words written
in a code of signal flags. The items are arranged in groups. In each group
three to six rows of individual flag symbols are presented. Each symbol
represents an unknown letter of the alphabet. They are arranged three or
four to the row, and each row forms a commonplace English word when
decoded. After examining the position of repeated flag symbols, it is
possible to deduce the word that must correspond to each of the symbol lines.
To illustrate the type of problem in the test, sample problem II, used in
the directions, is shown in figure 7.1. The accompanying text follows:

FIGURE 7.1

SAMPLE PROBLEMS OF DECODING TEST, CI214AX2

[Flag symbols not reproduced. Answer choices: A, BUD; B, HUG; C, BAT.]

Note that the letter b appears twice, each time at the beginning of a word. Since
the symbol which appears at the beginning of two code words is a black pennant
* * * this pennant must represent b. The letter u also occurs twice in these
words, both times in the middle of a word. Thus the code symbol for u is the
double white pennant * * * Since the first 2 symbols of item 6 are those which
stand for b and u, this item must be bud. The other two items are solved by noting
that item 5 begins with b and is therefore bat and that item 4 contains u as a
middle letter and, therefore, must be hug * * *
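The deduction described in the directions amounts to finding the one assignment of words to rows in which every flag always stands for the same letter. The sketch below illustrates that idea by brute force. The symbol names other than the black pennant and the double white pennant (`flag_1` through `flag_5`) are hypothetical placeholders, since the figure itself is not reproduced here, and the function name `decode` is likewise an assumption.

```python
from itertools import permutations


def decode(rows, words):
    """Return {item: word} for the first assignment of words to rows in which
    each symbol maps to one letter and each letter to one symbol."""
    items = list(rows)
    for candidate in permutations(words, len(items)):
        symbol_to_letter = {}
        letter_to_symbol = {}
        consistent = True
        for item, word in zip(items, candidate):
            symbols = rows[item]
            if len(symbols) != len(word):
                consistent = False
                break
            for symbol, letter in zip(symbols, word):
                # setdefault records a first sighting and returns any earlier mapping.
                if symbol_to_letter.setdefault(symbol, letter) != letter:
                    consistent = False
                if letter_to_symbol.setdefault(letter, symbol) != symbol:
                    consistent = False
            if not consistent:
                break
        if consistent:
            return dict(zip(items, candidate))
    return None


# Items 4, 5, and 6 of the sample problem, with placeholder symbol names.
rows = {
    4: ("flag_1", "double_white_pennant", "flag_2"),
    5: ("black_pennant", "flag_3", "flag_4"),
    6: ("black_pennant", "double_white_pennant", "flag_5"),
}
print(decode(rows, ["BUD", "HUG", "BAT"]))
# {4: 'HUG', 5: 'BAT', 6: 'BUD'}
```

The examinee reaches the same answer by inspection of the repeated symbols; the exhaustive search is used here only because it is short to write.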

An example, illustrative of the higher difficulty levels of the test, is the
last problem in the test, shown in figure 7.2.

FIGURE 7.2

SAMPLE PROBLEMS OF DECODING TEST, CI214AX2,
showing a difficult problem

[Flag symbols not reproduced. Answer choices: A, WING; B, NAZI; C, ZERO; D, OPAL; E, DIVE; F, SHIP.]

(1)  Internal  characteristics.— There  are  11  groups  of  flag  symbols, 
yielding  64  scored  responses. 

(2) Administration. — Because the task of the examinee is relatively
complex, administrative directions for the test are long and detailed.

The directions consist of a generalization of the test-task and two
practice problems with accompanying explanations. The administrator solves
the problems with the examinee by following directions printed below
each sample problem. Scratch paper is provided to all examinees.

(3) Scoring. — The scoring formula is R - W/5.

Statistical results. — This test has appeared in reliability, factor-analysis,
and  validation  studies. 

(1)  Distribution  statistics. — Typical  distribution  statistics  obtained  on 
this  test  are  shown  in  table  7.9. 


Table 7.9. — Distribution constants for Decoding Test, CI214AX2, based upon
samples of classified pilots

N           M        SD

(1) 563     23.0     10.2
(2) 231     26.2     10.7
(3) 895     23.4     10.5

(1) In class 44E. Tested at Psychological Research Unit No. 3.
(2) In class 44H. Tested at Psychological Research Unit No. 3.
(3) In classes 44E, 44F, and 44H. Tested at Psychological Research Unit No. 3. Overlaps with
two previous samples.

(2) Reliability coefficients. — As shown in table 3.1, the administration
of separately timed halves of the test yielded an uncorrected reliability
of 0.58 for unclassified aviation students and of 0.64 for unclassified
aviation students and airplane mechanics, with the coefficient unaffected by the
time interval between the administration of the two halves. The
corresponding corrected figures are 0.73 and 0.78.
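The corrected figures are what the Spearman-Brown formula gives when a half-test correlation is stepped up to full length. The source does not name the correction, so treating it as Spearman-Brown is an assumption, although the numbers agree; the function name `spearman_brown` is hypothetical.

```python
def spearman_brown(r_part, length_factor=2):
    """Estimated reliability of a test lengthened by `length_factor`.

    For two halves (length_factor=2) this is 2r / (1 + r).
    """
    return length_factor * r_part / (1 + (length_factor - 1) * r_part)


# Half-test correlations reported for Decoding, CI214AX2.
print(round(spearman_brown(0.58), 2))  # 0.73
print(round(spearman_brown(0.64), 2))  # 0.78
```

Both values match the corrected reliabilities quoted in the text.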

(3) Factorial composition. — In one somewhat unsatisfactory analysis
(of the November 1943 classification battery, see ch. 28), the test had
loadings of 0.32 on the spatial-relations factor and 0.31 on the
perceptual-speed factor. The communality was only 0.26. In this analysis, only the
perceptual, spatial, social-science background, verbal, mechanical, and
mathematical-background factors were defined. A better conception of the
factorial composition of a test of decoding may be gained from the
discussion of Decoding, CI214AX1, which immediately follows. For a full
description of the factorial composition of this test, see Appendix B.

(4)  Test  validity. — Validation  results  based  on  several  samples  are 
given  in  table  7.10. 


Table 7.10. — Validity data for Decoding Test, CI214AX2

Group                             Criterion                 N      p      Mg      Me      SD      r bis      r corr. (1)

Pilots in primary training (2)    Graduation-elimination    231    0.91   26.45   23.75   10.70   0.13       0.16
Pilots in primary training (3)    Graduation-elimination    563     .94   23.00   22.70   10.15    .01        .08
Pilots in primary training (4)    Graduation-elimination    895     .91   23.40   22.65   10.45    .02        .06
Pilots in primary training (5)    Graduation-elimination    483     .87   22.50   19.70   10.60    .14        .20
Navigation students               [illegible]               200                                   (6) .20
Navigation students (7)           Ground mission grades     200                                   (6) .24
Navigation students (7)           [illegible]               200                                   (6) .24

(1) Assuming an unrestricted stanine standard deviation of 2.00.
(2) In class 44H. Tested at Psychological Research Unit No. 3.
(3) In class 44E. Tested at Psychological Research Unit No. 3.
(4) In classes 44E, 44F, and 44H. Tested at Psychological Research Unit No. 3. Overlaps with
two previous samples.
(5) In classes 44I and 44J. Tested at Psychological Research Unit No. 3.
(6) Product-moment correlation.
(7) Same sample as above.
(5) Item validity. — Validation of items revealed a mean phi of 0.00,
based upon the responses of 600 graduates and 62 eliminees from primary
training (class 44H). The standard deviation of phi values was 0.08, and
the range was from -0.19 to +0.30.

Evaluation. — Decoding, CI214AX2, shows little pilot validity. The
highest corrected validity yielded by any of the numerous samples studied
is 0.20 on a total of 483 cases. The weighted average for 1,529 cases is
0.13. In the factor analysis of the battery of nonverbal reasoning tests,
Decoding, CI214AX1, an earlier form of the test (see discussion
immediately following), showed significant loadings in the following factors:
general reasoning, reasoning II, reasoning III, perceptual speed, and
spatial relations. If a reasoning factor with pilot validity exists, it is not
defined by this test in this analysis, for the pilot validity shown by the
test can be accounted for by its variance in the perceptual-speed and
spatial factors.

The navigator validity (0.24 uncorrected) is expected in view of the
test’s  reasoning,  spatial,  and  perceptual  content. 

Decoding,  CI214AX1 

A variation. — This preliminary form of the Decoding test differs
somewhat from the final version. It is important primarily because of factorial
data available on it.

Description. — This  form  of  the  test  is  divided  into  2  comparable  parts 
of  45  scored  responses  each.  Directions  and  type  of  items  are  identical 
with  those  in  the  final  form  of  the  test. 

(1) Administration. — The over-all testing time is 50 minutes; part I
takes  25  minutes,  part  II,  20  minutes,  and  the  directions  require  5 
minutes. 

(2) Scoring. — The scoring formula is R - W/4.

Statistical  results. — No  validation  data  were  compiled  on  this  test  in 
view  of  anticipated  revisions. 

(1) Distribution statistics. — A sample of 204 unclassified aviation
students (tested in May 1943 at Psychological Research Unit No. 3) yielded
a mean score of 28.3, a standard deviation of 10.1.

(2) Internal consistency. — The degree of homogeneity of the items is
indicated by a mean internal-consistency phi of 0.22, a standard deviation
of the phi distribution of 0.13, and a range of values from -0.06 to
+0.46. These statistics are based upon analysis of the responses of the
highest 25 percent and the lowest 25 percent in total score of the group of
204 unclassified aviation students mentioned above.

(3) Reliability coefficient. — By the alternate-forms method, an
estimated reliability coefficient of 0.72, corrected for length, was obtained.
This figure is based on the sample of 204 unclassified aviation students.

(4) Factorial composition. — The most significant loadings are in the
general reasoning (0.36), reasoning III (0.37), perceptual-speed (0.36),
reasoning II (0.30), and spatial-relations (0.19) factors. The
communality is 0.54. For a full picture of the factorial composition of this test see
Appendix B.

Evaluation. — Fifty-four percent of the test's total variance is accounted
for by the common factors extracted in the analysis to be described later
in this chapter, leaving considerable undefined nonerror variance.
Significant percentages of the total variance are attributed to the various factors
as follows: reasoning III, 14 percent; general reasoning, 13 percent;
perceptual, 13 percent; reasoning II, 9 percent; and spatial relations,
4 percent. The remaining variance is spread over other factors in
negligible amounts.
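The percentages quoted are the squares of the factor loadings given under factorial composition, which is the standard relation when factors are uncorrelated. The sketch below simply reproduces that arithmetic; the assumption of uncorrelated factors and the variable names are illustrative and not taken from the source.

```python
# Loadings reported for Decoding, CI214AX1.
loadings = {
    "reasoning III": 0.37,
    "general reasoning": 0.36,
    "perceptual speed": 0.36,
    "reasoning II": 0.30,
    "spatial relations": 0.19,
}

# With uncorrelated factors, each squared loading is the share of total
# test variance attributable to that factor.
percent_of_variance = {
    factor: round(100 * loading ** 2) for factor, loading in loadings.items()
}
print(percent_of_variance)
# {'reasoning III': 14, 'general reasoning': 13, 'perceptual speed': 13,
#  'reasoning II': 9, 'spatial relations': 4}

# Sum of the five squared loadings, to compare with the communality of 0.54.
print(round(sum(loading ** 2 for loading in loadings.values()), 2))  # 0.52
```

The five listed factors account for about 0.52 of the variance, which leaves roughly 0.02 of the reported communality of 0.54 for the factors with negligible loadings.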

The test is important because it helps define the two new factors,
reasoning II and reasoning III. (See factor analysis at end of this chapter.)

Figure Analogies Test, CI212AX1 *

This is a variation of the familiar figure-analogies test which has
appeared, among other places, in the American Council on Education College
Aptitude Test. Generally recognized as a nonverbal reasoning test, this
form was developed for inclusion in the analysis of the nonverbal
reasoning tests.

Description. — The Figure Analogies Test is designed to measure the
ability to formulate correct logical relationships between sets of geometric
figures. A test item presents the examinee first with three geometric
figures labeled X, Y, and Z, which set the problem, and then with five
alternate answers lettered A through E. Figure Y is always a simple variation
of figure X. After ascertaining the relationship between the first two
figures, the examinee selects from five alternatives the figure that bears the
same relation to Z as Y does to X. Sample problem 1, used in the directions,
is shown in the top panel of figure 7.3, and a problem from the body of
the test in the lower panels. The text for the sample problem follows:

Your task is to find which one of the five choices at the right goes with
figure Z the same way figure Y goes with figure X. Figure X is a circle; figure Y
is a similar circle divided into 4 equal parts. The figure that goes with figure Z the
same way the divided circle, Y, goes with the empty circle, X, is figure A. Of the
five choices, figure A is the only one which is divided into four equal parts. So, we
can say figure X is to figure Y as figure Z is to figure A. Fill in A after number 1
on your answer sheet.

(1) Internal characteristics. — The test is divided into two separately
timed parts, each consisting of 30 problems. There are five additional
unscored problems that are included in the test's administrative directions
as sample and practice problems.

(2) Administration. — The test is explained to the examinee with the
assistance of two simplified sample problems. He is then allowed 2 minutes
to solve three slightly more difficult practice problems and to correct
any errors in his work. Fifteen minutes are allowed for completion of each
part. The total testing time, including directions, sample, and practice
problems, is 35 minutes.

9 Developed at Psychological Research Unit No. 3. Chief contributor: Lt. Frank J. Dudek.

FIGURE 7.3. SAMPLE PROBLEMS OF FIGURE ANALOGIES, CI212AX1

(3) Scoring. — The scoring formula is R - W/4.
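The formula is the conventional rights-minus-a-fraction-of-wrongs score for five-choice items. A minimal Python sketch follows; the function name, the treatment of omitted items, and the example counts are illustrative assumptions and do not come from the report.

```python
def formula_score(rights: int, wrongs: int, n_choices: int = 5) -> float:
    """Rights minus wrongs / (n_choices - 1); omitted items carry no penalty."""
    if n_choices < 2:
        raise ValueError("an item needs at least two alternatives")
    return rights - wrongs / (n_choices - 1)


# Hypothetical examinee on the 60-item test: 40 right, 12 wrong, 8 omitted.
print(formula_score(40, 12))  # 37.0
```

With five alternatives the divisor is 4, which gives the R - W/4 form stated above.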

Statistical  results. — Relatively  complete  data  are  available  for  this  test. 

(1)  Distribution  statistics. — Typical  examples  of  distribution  statistics 
obtained  in  this  test  are  given  in  table  7.11.  The  distribution  curves  are 
slightly  negatively  skewed. 


Table 7.11. — Distribution constants for Figure Analogies, CI212AX1, based upon
samples of classified pilots

    N       M      SD
  ¹212    34.4    8.2
  ²216    36.1    9.9
  ³496    33.7    8.6

¹ In class 44A. Tested at Psychological Research Unit No. 3.
² In class 44B. Tested at Psychological Research Unit No. 3.
³ In class 44C. Tested at Psychological Research Unit No. 3.


(2)  Internal  consistency. — The  degree  of  homogeneity  of  the  items  is 
indicated  by  a  mean  internal-consistency  phi  of  0.32,  a  standard  deviation 
of  the  phi  distribution  of  0.12,  and  a  range  of  values  from  0.08  to  0.64. 
These  statistics  are  based  upon  analysis  of  the  responses  of  the  highest  25 
percent  and  the  lowest  25  percent  in  total  score  of  a  group  of  197  unclas¬ 
sified  aviation  students,  tested  in  March  1943  at  Psychological  Research 
Unit  No.  3. 
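The report does not say how each item phi was computed. One common reading, sketched here in Python as an assumption, is the fourfold-table phi between passing the item and membership in the highest or lowest 25 percent on total score. The counts in the example are hypothetical.

```python
import math


def phi_coefficient(upper_pass: int, upper_fail: int,
                    lower_pass: int, lower_fail: int) -> float:
    """Phi for a 2 x 2 table: extreme group (upper/lower) by item result (pass/fail)."""
    a, b, c, d = upper_pass, upper_fail, lower_pass, lower_fail
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        return 0.0  # a margin is empty, so the coefficient is undefined; report 0
    return (a * d - b * c) / denom


# Hypothetical item: 49 examinees per extreme group (about 25 percent of 197).
phi = phi_coefficient(upper_pass=40, upper_fail=9, lower_pass=24, lower_fail=25)
print(round(phi, 2))  # 0.34
```

Averaging such values over all items would give a mean phi comparable in kind to the 0.32 reported above.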

(3) Reliability coefficient. — By the alternate-forms method, an
estimated reliability coefficient of 0.82, corrected for length, was obtained.
This figure is based on a sample of 1,200 unclassified aviation students.

(4) Difficulty. — Based upon item analysis of the responses of 197
unclassified aviation students, the test yielded a mean proportion of correct
responses of 0.58, corrected for chance success, with a range from 0.00 to
0.96 and a standard deviation of 0.27.

(5) Factorial composition. — The most significant loadings are in the
reasoning II (0.40), general-reasoning (0.34), integration III (0.34),
reasoning III (0.31), visualization (0.28), verbal (0.23), and numerical
(0.20) factors. The communality is 0.76. For a full picture of the
factorial composition of this test see Appendix B.

(6) Test validity. — Validation results based on several samples are
given in table 7.12.

Table 7.12. — Validity data for Figure Analogies, CI212AX1

Group                  Criterion                Nt      pg     Mg      Me      SDt     rbis    rbis (corr.)
Pilots in primary¹     Graduation-elimination   496     …      33.94   31.09   …       0.16    …
Pilots in primary²     Graduation-elimination   712     …      34.66   32.20   …       .13     …
Pilots in primary³     Graduation-elimination   796     .93    34.12   31.66   10.86   .11     ⁴0.19
Pilots in primary⁵     Graduation-elimination   634     …      35.13   32.19   8.40    .17
Navigation students    …                        200                                    ⁶.28
Navigation students⁷   Ground missions          200                                    ⁶.14
Navigation students⁷   …                        200                                    ⁶.29
Navigation students⁸   Graduation-elimination   1,675   .92    38.39   33.82   7.40    .39     ⁴.61

¹ In class 44C. Tested at Psychological Research Unit No. 3.
² In classes 44B and 44C. Tested at Psychological Research Unit No. 3. Overlaps previous sample.
³ In classes 44B, 44C, and 44D. Tested at Psychological Research Unit No. 3. Partially overlaps previous sample.
⁴ Assuming an unrestricted stanine standard deviation of 2.00.
⁵ In classes 44D and 44E. Tested at Psychological Research Unit No. 3. Partially overlaps previous sample.
⁶ Product-moment correlation.
⁷ Same sample as above.
⁸ Tested at Psychological Research Unit No. 1 in June 1944; at Psychological Research Unit No. 2 in May 1944; and at Psychological Research Unit No. 1 in April 1944.


Evaluation. — The weighted averages of the factor loadings of two factor
analyses (N = 46…) account for 76 percent of this test's total variance.
The percentages of total variance accounted for are: reasoning II, 16
percent; general reasoning, 12 percent; integration III, 12 percent;
reasoning III, 10 percent; visualization, 8 percent; verbal, 5 percent; and
numerical, 4 percent. The remaining variance is spread over other factors
in insignificant amounts.
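The percentages quoted in this evaluation are the squares of the loadings listed under factorial composition, expressed as percentages of total variance. The short Python sketch below makes the arithmetic explicit; the dictionary layout and variable names are illustrative only.

```python
# Loadings as listed under "Factorial composition" for Figure Analogies.
loadings = {
    "reasoning II": 0.40,
    "general reasoning": 0.34,
    "integration III": 0.34,
    "reasoning III": 0.31,
    "visualization": 0.28,
    "verbal": 0.23,
    "numerical": 0.20,
}
communality = 0.76  # total common-factor variance reported for the test

# Variance attributable to a factor is the squared loading.
percent = {name: round(100 * a * a) for name, a in loadings.items()}
listed = sum(a * a for a in loadings.values())

print(percent)
print(f"listed factors: {listed:.2f}; other factors: {communality - listed:.2f}")
```

The seven listed factors account for about 0.66 of the total variance, which leaves roughly 0.10 of the 0.76 communality for the remaining factors.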

The validity figures are similar to those of other tests in the nonverbal
reasoning group. The pilot validity appears to be the result of the
combined loadings of several pilot-valid factors, including visualization and
perceptual speed. The very much higher navigator validity is to be
expected from loadings in the general reasoning, verbal, and numerical
factors. The test is also important because it best defines the new factor,
reasoning II (see below).

Figure  Classification,  CI213AX1  10 

This  is  a  new  version  of  a  familiar  test.  It  was  developed  as  a  com¬ 
ponent  part  of  the  nonverbal  reasoning  group. 

10 Developed at Psychological Research Unit No. 3. Chief contributor: Lt. Mahlon B. Smith.

Description. — As stated in the test's directions, this is a test of the
ability to draw comparisons and make generalizations. The task of the
examinee is to select from five alternatives the geometric figure that has
the characteristic common to each of three figures that set the problem.
Practice problem 1, used in the directions, is shown in the upper panel of
figure 7.4. The explanatory text accompanying this sample problem is:

The  three  figures  to  the  left  of  the  heavy  line,  although  of  different  shapes  and 
sizes,  are  alike  in  one  way.  The  lines  which  bound  the  figure  are  straight  lines. 
Now  examine  the  five  figures  labeled  A,  B,  C,  D,  and  E.  Find  the  one  figure  which 
is  bounded  only  by  straight  lines.  The  only  figure  which  meets  this  requirement  is 
figure  D. 
FIGURE 7.4. SAMPLE PROBLEMS OF FIGURE CLASSIFICATION, CI213AX1

The typical test problem requires the detection of exact, but obscure,
similarities. At the more difficult levels, the key figures of a test problem
appear to bear absolutely no relationship to each other upon initial
inspection. Their similarities may exist in such minor characteristics as number
of dimensions, type of shading, number of divided areas, type of lines
used to enclose the figures, inclusion of certain type and number of angles,
etc. An example of one of the more difficult problems is shown in the
lower panel of figure 7.4. Figure "D" is the correct answer to this
problem. It is the only alternative possessing the characteristic the three key
figures have in common; i.e., formation of the figure by use of one
continuous line with both ends free.

(1)  Internal  characteristics. — The  test  is  divided  into  two  separately 
timed  parts,  each  containing  16  items.  There  are  two  practice  items  at  the 
beginning  of  the  test. 

(2) Administration. — The time limits established for this test are as
follows: Directions, 1 minute; part I, 12 minutes; part II, 10½ minutes;
over-all testing time, 23½ minutes.

(3) Scoring. — The scoring formula is R - W/4.

Statistical results. — Extensive data are available for this test.

(1) Distribution statistics. — Typical examples of distribution statistics
obtained on this test are given in table 7.13. The distribution curves are
slightly positively skewed.

Table 7.13. — Distribution constants for Figure Classification, CI213AX1, based
upon samples of classified pilots

    N       M      SD
  ¹693    13.1    7.0
  ²955    13.0    7.6

¹ In classes 44D and 44E. Tested at Psychological Research Unit No. 3.
² In classes 44D, 44E, and 44H. Tested at Psychological Research Unit No. 3. Partially
overlaps previous sample.


(2)  Internal  consistency. — The  degree  of  homogeneity  of  the  items  is 
indicated  by  a  mean  internal-consistency  phi  of  0.49,  a  standard  deviation 
of  the  phi  distribution  of  0.18,  and  a  range  of  values  from  0.10  to  0.77. 
These  statistics  are  based  upon  analysis  of  the  responses  of  the  highest 
25  percent  and  the  lowest  25  percent  in  total  score  of  a  group  of  480  un¬ 
classified  aviation  students,  tested  in  March  1943  at  Psychological  Re¬ 
search  Unit  No.  3. 

(3)  Reliability  coefficient. — By  the  alternate-forms  method,  an  esti¬ 
mated  reliability  coefficient  of  0.78,  corrected  for  length,  was  obtained. 
This  figure  is  based  on  a  sample  of  440  unclassified  aviation  students, 
tested  in  March  1943  at  Psychological  Research  Unit  No.  3. 

(4)  Difficulty. — Based  upon  item  analysis  of  the  responses  of  450  un¬ 
classified  aviation  students  (tested  in  March  1943  at  Psychological  Re¬ 
search  Unit  No.  3),  the  test  yielded  a  mean  proportion  of  correct  re¬ 
sponses  of  0.45,  corrected  for  chance  success,  with  a  range  from  0.05  to 
0.99  and  a  standard  deviation  of  0.20. 

(5) Factorial composition. — The most significant loadings are in the
integration III (0.38) and reasoning III (0.32) factors. It is important
to note that the test has a loading of only 0.03 on the general reasoning
factor and of 0.15 on the verbal factor. The communality is only 0.30. For
a full picture of the factorial composition of this test see Appendix B.

(6)  Test  validity. — Validation  results  based  on  several  samples  are 
given  in  table  7.14. 


Table 7.14. — Validity data for Figure Classification, CI213AX1

Group                  Criterion                Nt     pg    Mg    Me    SDt   rbis     ¹rbis (corr.)
Pilots in primary²     Graduation-elimination   693    …     …     …     …     -0.06    …
Pilots in primary³     Graduation-elimination   262    …     …     …     …     -.05     …
Pilots in primary⁴     Graduation-elimination   194    …     …     …     …     .14      …
Navigation students    Flight mission grades    200                            ⁵.02
Navigation students⁶   Ground mission grades    200                            ⁵.13
Navigation students⁶   Weighted total grades    200                            ⁵.09

(Two corrected coefficients, -0.03 and .03, are legible, but their rows cannot be determined.)

¹ Assuming an unrestricted stanine standard deviation of 2.00.
² In classes 44D and 44E. Tested at Psychological Research Unit No. 3.
³ In class 44H. Tested at Psychological Research Unit No. 3.
⁴ In class 44A. Tested at Psychological Research Unit No. 3.
⁵ Product-moment correlations.
⁶ Same sample as the one preceding.

(7) Item validity. — Validation of items revealed a mean phi of 0.02,
based upon the responses of 600 graduates and 41 eliminees from training
(in class 44D; tested at Psychological Research Unit No. 3). The standard
deviation was 0.07, and the range was from -0.16 to +0.15.

Evaluation. — Although  the  test  has  satisfactory  reliability,  its  pilot 
validity  is  zero  or  slightly  negative,  and  its  navigator  validity  is  extremely 
low,  if  not  zero. 

The following factors account for the indicated percentages of the test's
total variance: verbal, 2 percent; general reasoning, 0 percent; reasoning
III, 12 percent; reasoning II, 3 percent; integration III, 15 percent. The
zero loading in the navigator-valid general-reasoning factor is mentioned
to help interpret the unusually low navigator validity of the test. All factor
loadings are critically low, with the possible exception of reasoning III
and integration III. These two factors are not known to be valid for any
air-crew position.

The common-factor variance represents only 30 percent of the test's
total variance. The known factorial content obviously does not present a
complete picture of this test. There is little or no pilot or navigator validity
to be accounted for by the unknown variance, so this test deserves no
further attention in aviation psychology.

Pattern  Sequence,  CI217B 

Like Numerical Sequence, CI217A (see above), this test was developed
at Tuskegee Army Air Field for possible use in the classification of
Negro aircrew; and it is in completion form, rather than multiple-choice
form. It, too, was administered in an intercorrelational study to discover
any possible effect of the multiple-choice form upon factorial content.

Description. — Each problem in the test consists of a series of geometric
figures, constructed in accordance with a rule of progression. The
examinee must determine that rule, and supply the next figure in the series.
Thus, the test has some characteristics of the Number Series, Figure
Classification, and Figure Analogies tests.


FIGURE 7.5. SAMPLE PROBLEMS OF PATTERN SEQUENCE, CI217B

(1) Internal characteristics. — There are 40 scored items and 1 sample
item. The scored items are divided equally between two separately timed
parts. Sample problems are shown in figure 7.5.

(2)  Administration. — One  sample  and  one  practice  problem  compose 
most  of  the  test's  formal  directions.  Five  minutes  are  allowed  for  each 
part. 

(3) Scoring. — The number of correct and the number of incorrect
answers are recorded for this test.

Statistical results. (1) Test validity. — Based on a (Negro) sample of
469 graduates and 217 eliminees from pilot primary training, the
uncorrected validity was 0.12. The mean score of graduates was 10.44 and
of eliminees, 9.76. The over-all standard deviation was 3.36.
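The paragraph does not name the coefficient. Assuming it is a biserial correlation, the usual statistic for a pass-fail criterion such as graduation-elimination, the reported value can be recovered from the group means, the over-all standard deviation, and the proportion graduating. The Python sketch below uses only the standard library; the function and argument names are illustrative.

```python
from statistics import NormalDist


def biserial_r(mean_pass: float, mean_fail: float,
               sd_total: float, p_pass: float) -> float:
    """Biserial r = (Mp - Mf) / SD * (p * q / y).

    y is the ordinate of the unit normal curve at the point that divides
    the area into the proportions p and q.
    """
    q_fail = 1.0 - p_pass
    z_cut = NormalDist().inv_cdf(p_pass)
    ordinate = NormalDist().pdf(z_cut)
    return (mean_pass - mean_fail) / sd_total * (p_pass * q_fail / ordinate)


# Values reported for Pattern Sequence: 469 graduates, 217 eliminees.
n_grad, n_elim = 469, 217
r = biserial_r(10.44, 9.76, 3.36, n_grad / (n_grad + n_elim))
print(round(r, 2))  # 0.12
```

The agreement with the reported 0.12 is consistent with the biserial assumption but does not prove that this was the computing procedure used.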

(2) Intercorrelations. — Selected intercorrelations are shown in table
7.15, comparing Figure Analogies, CI212AX1, and Pattern Sequence.


Table 7.15. — Product-moment correlations of Pattern Sequence, CI217B, and
Figure Analogies, CI212AX1, with selected tests (N = 353 unclassified students)¹

                                                   Correlations with
                                          Figure Analogies    Pattern Sequence
Test                                          (R-W/4)            R        W
Arithmetic Reasoning, CI206C                    0.43           0.4…     -0.2…
Reading Comprehension, CI614H                    .41            .42      -.21
Numerical Operations (Front), CI702B             .20            .34      -.15
Numerical Operations (Back), CI702B              .25            .37      -.10
Dial and Table Reading, CP621-622A               .43            .50      -.22
Figure Analogies, CI212AX1                                      .46      -.42

¹ Tested in October 1944 at Medical and Psychological Examining Unit No. 8.


Evaluation. — The  similar  correlations  of  Figure  Analogies  and  Pattern 
Sequence  with  Arithmetic  Reasoning  and  Reading  Comprehension  sug¬ 
gest  that  reasoning  variance  is  not  changed  by  using  a  nonmultiple-choice 
form. 

The correlations with the Numerical Operations and Dial and Table
Reading tests, however, suggest increased variance in the numerical
factor of the free-response form. The reader will recall that the same
conclusion was drawn in comparing the Number Series and the Number
Sequence tests.

As was true for that comparison also, the correlation between the
free-response and multiple-choice forms is unexpectedly low.

Again  it  should  be  stated  that  these  data  are  merely  suggestive,  since 
the  tests  compared  differ  in  other  respects  than  the  use  of  prepared  alter¬ 
natives. 

Spatial  Reasoning,  CI211BX1  11 

This  test  is  a  revision  of  the  Thurstone  marks  test  and  was  designed 
for  inclusion  in  the  group  of  nonverbal  reasoning  tests. 

11 Developed at Psychological Research Unit No. 3. Chief contributors: Lt. Lewis C. Carpenter,
Jr., and Lt. Una Hutchinson.

Description. — The test requires the examinee to detect the principle
governing the placement of letter symbols in a spatial pattern of dashes
and gaps. Figure 7.6 presents sample problem two of the test. The
solution for this problem, taken from the test's administrative directions,
reads:

The rule for example 2 is, "The number of marks to the left of X increases by
one in each row. Y is just to the right of the last gap in each row." X and Y are
omitted from the last row, but according to the rule, X should be at the 5th mark
(E) and Y should be at the mark following the last gap (I). Blacken spaces E
and I after item 2 on your answer sheet.


FIGURE 7.6. SAMPLE PROBLEM OF SPATIAL REASONING, CI211BX1

(1)  Internal  characteristics. — The  test  is  divided  into  2  parts,  each 
containing  17  scored  items.  Some  items  call  for  the  placement  of  two 
symbols,  others  of  three  symbols.  There  is  a  total  of  78  scored  responses 
in  the  test. 

(2) Administration. — A general statement of the task involved, a
standard paragraph on use of the IBM answer sheet, and three sample
problems make up the formal test directions. Total testing time, including
5 minutes for directions and administration, is 50 minutes. The time
limit for part I is 25 minutes and for part II, 20 minutes. Three minutes
before the end of each period, examinees are informed of the time
remaining. A 15-place IBM answer sheet is used with the test.

(3) Scoring. — The scoring formula is R - W/5.

Statistical results. — Owing to the early development of a revised form,
only limited data are available on this test.

(1)  Distribution  statistics.— A  sample  of  224  unclassified  aviation  stu¬ 
dents  tested  at  Psychological  Research  Unit  No.  3  in  April  1942  yielded 
a  mean  score  of  47.5,  and  a  standard  deviation  of  16.1. 

(2) Reliability coefficient. — A sample of 224 unclassified aviation
students tested in March 1942 at Psychological Research Unit No. 3 yielded
an alternate-forms correlation of 0.74, which corrects to 0.85.
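The report gives only the two figures. The step from 0.74 to 0.85 is what the Spearman-Brown formula yields when a correlation between two parts is stepped up to the full, doubled length, so that formula is assumed in the Python sketch below. The function name is illustrative.

```python
def spearman_brown(r_part: float, length_factor: float = 2.0) -> float:
    """Estimated reliability of a test lengthened by length_factor."""
    return length_factor * r_part / (1.0 + (length_factor - 1.0) * r_part)


# Alternate-forms correlation reported for Spatial Reasoning, CI211BX1.
print(round(spearman_brown(0.74), 2))  # 0.85
```

Here 2(0.74) / (1 + 0.74) = 1.48 / 1.74, which rounds to the 0.85 reported.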

(3) Factorial composition. — The most significant loadings are in the
reasoning I (0.45), reasoning III (0.38), integration III (0.38), spatial
(0.26), and verbal (0.20) factors. The communality is 0.72. For a full
picture of the factorial composition of this test see Appendix B.

(4) Test validity. — Validation results based on several samples are
given in table 7.16.

Table 7.16. — Validity data for Spatial Reasoning, CI211BX1

Group                          Criterion                Nt     pg     Mg      Me      SDt     rbis
Pilots in primary training¹    Graduation-elimination   104    0.27   42.60   38.64   20.60   0.11
Navigation students            Flight missions          200                                   ².21
Navigation students³           Ground missions          200                                   ².26
Navigation students³           Weighted total           200                                   ².27

¹ In class 43I. Tested at Psychological Research Unit No. 3.
² Product-moment correlations.
³ Same sample as that preceding.


Evaluation. — Seventy-two  percent  of  the  total  variance  of  the  test  is 
accounted  for  by  loadings  in  common  factors  extracted  in  two  factor 
analyses.  Significant  percentages  of  variance  are  attributed  to  the  follow¬ 
ing  factors:  reasoning  I,  20  percent;  reasoning  III,  14  percent;  integra¬ 
tion  III,  14  percent;  spatial,  7  percent;  and  verbal,  4  percent.  The  remain¬ 
ing  common-factor  variance  is  spread  over  seven  other  factors. 

This  test  has  its  highest  loading  (0.45)  on  the  reasoning  I  factor.  Its 
reliability  is  satisfactory.  As  expected  for  a  general-reasoning  test,  it  is 
valid  for  navigators,  but  not  for  pilots. 

Spatial  Reasoning,  CI211BX2 

A variation. — This test is a revision of Spatial Reasoning, CI211BX1,
and differs in surface characteristics only. Additional data of value,
however, are available on this form of the test.

Description. — Parts I and II of the original test are combined in this
form and the total number of scored responses reduced to 70. The x, y, z
symbols used to formulate test problems are replaced by numerical digits
corresponding to the numbers of the problems. Test directions were
rewritten in the interests of clarity, but without major change.

(1)  Administration. — Over-all  testing  time  was  cut  from  50  to  30  min¬ 
utes,  with  administration  time  remaining  constant. 

(2) Scoring. — The scoring formula is R - W/5.

Statistical  results. — Considerable  validity  and  distribution  data  were 
compiled  on  this  form. 

(1)  Distribution  statistics. — Typical  distribution  statistics  obtained  on 
this  test  are  given  in  table  7.17.  The  distribution  curves  are  somewhat 
positively  skewed  and  considerably  flatter  than  normal. 


Table 7.17. — Distribution constants for Spatial Reasoning, CI211BX2, based upon
samples of pilots in primary training

    N       M      SD
  ¹104    37.7    20.6
  ²219    46.6    16.5
  ³…      23.6    14.6

¹ In class 43I. Tested at Psychological Research Unit No. 3.
² In class 44A. Tested at Psychological Research Unit No. 3.
³ In classes 44D and 44E. Tested at Psychological Research Unit No. 3.

(2) Test validity. — Validation results based on several samples are
given in table 7.18.


Table 7.18. — Validity data for Spatial Reasoning, CI211BX2, using the
graduation-elimination criterion

Group                  Nt      pg     Mg      Me      SDt     rbis   ¹rbis (corr.)
Pilots in primary²     571     0.94   24.50   22.85   14.65   0.05   0.09
Pilots in primary³     686     .94    23.75   22.10   14.55   .05    .09
Navigation students⁴   1,291   .93    27.98   19.67   14.69   .28    .48

¹ Assuming an unrestricted stanine standard deviation of 2.00.
² In class 44D. Tested at Psychological Research Unit No. 3.
³ In classes 44D and 44E. Tested at Psychological Research Unit No. 3. Overlaps previous
sample.
⁴ Tested at Psychological Research Unit No. 1 on May 30 and June 1, 1944; at
Psychological Research Unit No. 2 in May 1944; at Psychological Research Unit No. 3 in April 1944.

Evaluation. — As  indicated  by  validation  data  on  several  samples  of  sub¬ 
stantial  size,  the  test  holds  little  promise  as  an  instrument  of  pilot  selec¬ 
tion.  Like  other  tests  of  the  nonverbal  reasoning  group,  the  test  has  sub¬ 
stantial  navigator  validity.  The  factor  pattern  for  this  form  of  the  test 
should  be  closely  similar  to  that  for  the  BX1  form. 

A FACTOR ANALYSIS OF REASONING TESTS 12

Despite the large amount of existing research into the psychology of
reasoning, little attention has been paid to the problem of the
statistically independent abilities that enter into the solutions of reasoning tasks.
The tests described in the previous pages of this chapter all seem to
involve thinking through to a solution. Some appear superficially to be tasks
of deduction, others of induction. Some require the examinee to adopt and
test various hypotheses; some merely set a problem which can be reasoned
through to a solution by the application of the rules of mathematics.
These and many more aspects of the tests may be delineated by armchair
analysis. It is desirable, however, to establish statistical reference points
to guide such introspective analysis and, equally important, to secure
quantitative evaluations of the contents of the various tests.

The  Data 

The intercorrelations for this study are based upon a sample of 202
classified pilots who were awaiting entrance to preflight school. At the
time of testing (spring of 1943), restriction of range due to
disqualification for low aptitude did not constitute a major problem; so the
intercorrelations probably are not biased in this sense. The matrix of
intercorrelations appears in table 7.19.

Included in the battery of tests are those which were considered, at that
time, most promising as nonverbal reasoning tests. These are Spatial
Reasoning, Figure Classification, Figure Analogies, Decoding, Number
Series Completion, and the Spatial Visualization tests. All but the Spatial
Visualization tests were discussed in this chapter.¹³

12 Accomplished at Psychological Research Unit No. 3. Chief contributors: Capt. Lloyd G.
Humphreys and Lt. David H. Jenkins.

Table 7.19. — Matrix of intercorrelations (entries not recoverable)

In addition to these, seven other experimental tests were included in
the battery, although they were not constructed as part of the nonverbal
reasoning project. It was hoped, however, that they would clarify the
analysis. These tests are: Planning Air Maneuvers and Competitive
Planning, discussed in chapter 9; Instrument Comprehension I and II,
discussed in chapter 19; Pattern Comprehension, discussed in chapter 12;
Pursuit, CP512A, discussed in chapter 16; and Aptitude Test, part III,
QP901A. The last test, not treated in this volume, is a Gottschaldt figures
test, similar to the test, Camouflaged Outlines, CP821A, discussed in
chapter 17.

Eight classification tests were included in the matrix to serve as
reference variables. They are: Speed of Identification and Spatial Orientation
I, to define the perceptual speed factor; Technical Vocabulary
(Navigator) and Reading Comprehension, to define the verbal factor;
Mathematics B, for the general reasoning factor; Numerical Operations, for the
numerical factor; the SAM Complex Coordination test, to define the
spatial relations factor; and the Mechanical Principles test, which was
intended to define a mechanical factor but instead defined a new factor,
as will be seen. All these tests are described fully in this volume, with the
exception of the Complex Coordination test. This test is briefly described
on p. 122, and fully described in report no. 4 of this series.

Nine  factors  were  extracted  and  interpreted.  The  centroid  loadings  are 
shown  in  table  7.20  and  the  rotated  loadings  in  table  7.21. 


The Factors

Rotated  factor  I  is  defined  by  the  following  tests  and  loadings: 


Test number   Test name                      Loading
2             Spatial Orientation I          0.6…
…             Speed of Identification        …
15            Pursuit                        …
21            Decoding                       …
…             Complex Coordination           .2…
11            Instrument Comprehension I     .2…

This is the familiar perceptual speed factor, which usually clearly
emerges when either Spatial Orientation I or Speed of Identification is in
the matrix. In all analyses, one or the other of these tests best defines the
factor, with the other test taking second place. The weighted average
loadings (see table 28.15) show Speed of Identification to be the slightly
better measure of the factor.

The substantial loading on this factor for the Pursuit test indicates
that performance in the test is facilitated by quick apprehension of the
details of the intersections of lines. It should be noted (see table 28.15) that
the test has its highest loading (weighted average) in this factor.

No  reasoning  test,  except  Decoding,  has  a  loading  greater  than  0.15 
with  this  factor.  The  loading  of  Decoding  implies  a  need  for  reduction  of 
this  factor  by  utilizing  flag-symbols  that  are  distinctly  different  from  one 
another. 


Rotated  factor  II  is  defined  by  the  following  tests  and  loadings: 


Test number   Test name                      Loading
     6        Numerical Operations           0.69
     5        Mathematics B                   .51
    22        Number Series Completion        .47
    13        Pursuit                        [illegible]


This  is  the  numerical  factor,  which  is  also  clearly  defined  in  most 
matrices.  Of  the  experimental  reasoning  tests,  only  Number  Series  Com¬ 
pletion  appears  projected  on  this  factor  with  a  loading  greater  than  0.20. 
Comparing  the  loadings  of  the  Number  Series  test  on  this  and  on  the  gen¬ 
eral-reasoning  factor  (0.36),  it  can  be  seen  that  it  is,  at  least  for  the  avia¬ 
tion-student  population,  more  of  a  numerical  test  than  a  reasoning  test. 

The  substantial  loading  of  the  Pursuit  test  on  this  factor  invites  com¬ 
ment.  The  test  does  not  involve  the  numerical  operations  of  addition,  sub¬ 
traction,  multiplication,  and  division,  but  it  does  involve  the  location  and 
use  of  numbers  in  answering  the  test  items.  In  two  other  analyses  (see 
ch.  9  and  16),  Pursuit  had  loadings  of  0.19  and  0.26  on  the  numerical 
factor.  The  weighted  average  loading  is  0.25  (see  table  28.15).  Another 
test, Organizational Planning (see ch. 9), also has a nonnegligible loading
(0.41)  on  this  factor.  This  test,  too,  does  not  involve  numerical  opera¬ 
tions  as  much  as  it  requires  the  examinee  to  locate  and  remember  num¬ 
bers.  Apparently,  then,  the  definition  of  the  numerical  factor  must  be 
broadened  to  include  more  than  the  simple  numerical  operations. 

Rotated factor III is defined by the following tests and loadings:

Test number   Test name                      Loading
     5        Mathematics B                  0.55
    16        Spatial Reasoning               .46
    20        Spatial Visualization II        .44
    10        Competitive Planning            .41
    11        Instrument Comprehension I      .36
    21        Decoding                        .36
    22        Number Series Completion        .36
    18        Figure Analogies                .35
    19        Spatial Visualization I         .34
     4        Reading Comprehension           .31
    15        Pattern Comprehension           .31
     9        Planning Air Maneuvers          .25
     7        Mechanical Principles           .25


This  is  identified  as  the  general-reasoning  factor,  usually  defined  by 
Mathematics  B.  All  the  tests,  with  the  exception  of  Figure  Classification, 
that  were  considered  to  be  promising  nonverbal  reasoning  tests  contain 
this  factor,  but  with  loadings  ranging  from  only  0.34  to  0.46. 


Rotated factor IV is defined by the following tests and loadings:

Test number   Test name                      Loading
    12        Instrument Comprehension II    0.52
     8        Complex Coordination            .52
     9        Planning Air Maneuvers          .41
    11        Instrument Comprehension I      .39
     7        Mechanical Principles           .29
    19        Spatial Visualization I         .24


This  is  the  familiar  spatial-relations  factor,  usually  best  defined  by  the 
two  leading  tests  in  the  tabulation  above.  No  reasoning  test  has  an  im¬ 
portant  loading  on  this  factor ;  the  highest  loading  is  for  Spatial  Visuali¬ 
zation  I  (0.24). 


Rotated factor V is defined by the following tests and loadings:

Test number   Test name                            Loading
    19        Spatial Visualization I              0.56
     7        Mechanical Principles                 .54
    20        Spatial Visualization II             [illegible]
    15        Pattern Comprehension                 .50
    14        Gottschaldt Figures (QP Part III)     .55


This is the visualization factor. The factor was first defined in this
analysis of nonverbal reasoning tests. The tests highest on the factor
apparently involve the manipulation of visual imagery. Of the reasoning
tests, only the Spatial Visualization tests have loadings on this factor
greater than 0.17.

Rotated  factor  VI  is  defined  by  the  following  tests  and  loadings: 


Test number   Test name                      Loading
     3        [illegible]                    0.66
     4        Reading Comprehension           .6[?]
     5        Mathematics B                   .54


This is the verbal factor. Its absence in tests utilizing apparently
complicated verbal directions is eloquent testimony of the success of careful
test construction. No nonverbal reasoning test has a verbal loading greater
than 0.18.

Rotated  factor  VII  is  defined  by  the  following  tests  and  loadings: 


Test numbers: 18, 14, 12, 11, 19, 1, 21, 4, 15
Test names (legible entries): Gottschaldt Figures (QP Part III); Reading Comprehension
Loadings: 0.40, .59, .56, .55, .52, .51, .50, .27, .34


This new factor defies precise description. No test has a very high
saturation in the factor, and all the tests on it are complex. Two tentative
definitions of this factor have been proposed. The first is that it is a visual
memory factor, identical with that best defined by the Map Memory tests
(see ch. 11), with either or both of these possibly identical with the factor
best defined by the Plane Formation tests (see ch. 16). But the presence
of Spatial Visualization II and Reading Comprehension, which do not
involve pictorial material, is against this interpretation. The second
hypothesis is that it is another reasoning factor, although it is not precisely
definable yet. Until better evidence is available, this factor may be called
reasoning II.


All the experimental reasoning tests except Number Series Completion
appear projected on this factor, albeit with moderate loadings. There is
thus strong indication of the existence of a new reasoning factor. The
moderate loadings in this factor, however, make it difficult to formulate
a definition. One hypothesis was advanced that all these tests call for the
fluent formation of hypotheses. The presence of Spatial Visualization II
on the factor, however, is against this interpretation. A reasonable
possibility is that all the tests on the factor involve sequential reasoning, i.e.,
whether “A” is true depends on whether B, C, and D are true. This
interpretation emphasizes the evaluation of hypotheses, rather than their
formulation. If this is correct, however, the absence of Number Series is
difficult to rationalize. A test is urgently needed that will have a high
loading on the factor. When that test is constructed, the factor may be
defined. Until then, the factor may be called reasoning III.


Rotated  factor  IX  is  defined  by  the  following  tests  and  loadings: 


Test number   Test name                      Loading
    16        Spatial Reasoning              0.3[?]
    17        Figure Classification          [illegible]
     9        Planning Air Maneuvers         [illegible]
    10        Competitive Planning            .31
    18        Figure Analogies                .2[?]


This factor seems to be integration III (see ch. 10), and perhaps it
is clearly described by the phrase “taking into account.” In all the tests
on this factor, the examinee is required to select one of many courses of
action. His selection of the correct one depends upon his ability to take
into account all the aspects of the given situation.


Conclusions

This analysis yields no conclusive insight into reasoning tests. The
results are a challenge to future investigators.

The main facts and interpretations concerning the experimental
reasoning tests may be enumerated as follows:

(1)  Reasoning  III  is  probably  a  true  reasoning  factor,  since  all  but 
one  of  the  experimental  reasoning  tests  have  moderate  loadings  on  the 
factor. More research is needed, however, before a clear definition of the
ability  can  be  formulated. 

(2)  That  reasoning  II  is  truly  another  reasoning  ability  is  more  dubi¬ 
ous.  Again,  only  future  research  can  establish  the  facts. 

(3) The experimental reasoning tests are all factorially complex, and
they  do  not  have  very  high  loadings  on  any  factor.  Commonly,  the  tests 
include  variance  of  the  general-reasoning,  reasoning  II,  reasoning  III, 
and  integration  III  factors.  All  but  one  of  the  experimental  nonverbal 
reasoning  tests  appear  to  contain  the  familiar  general-reasoning  factor, 
with  loadings  ranging  from  0.34  to  0.46. 

(4)  Nonverbal-reasoning  tests  typically  are  free  of  even  moderate 
saturations  with  perceptual,  numerical,  spatial,  visualization,  and  verbal 
factors,  although  several  tests  have  important  loadings  with  one  or  an¬ 
other  of  these  factors.  The  loading  of  Decoding  on  the  perceptual  factor 
points  to  a  defect  in  test  construction,  which  can  be  easily  rectified  in 
future  work  with  the  test.  The  loading  of  the  Number  Series  test  in  the 
numerical  factor  is  probably  unavoidable.  There  would  seem  to  be  little 
promise  of  purifying  the  test  to  increase  reasoning  content  at  the  expense 
of  numerical.  The  visualization  content  of  the  Spatial  Visualization  tests 
overshadows their reasoning content. As a matter of fact, in later work
with one of these tests (see ch. 12) an attempt was made to increase
visualization  and  decrease  reasoning  content. 

(5) Comparing the communalities with the estimates of reliability
given in this chapter, it may be seen that for every experimental reasoning
test there is considerable undefined nonerror variance. For each test, the
approximate percentages of such unknown variance are: 24 percent for
Spatial Reasoning; 48 percent for Figure Classification; 22 percent for
Figure Analogies; 14 percent for Spatial Visualization I; 12 percent for
Spatial Visualization II; 28 percent for Number Series Completion; and
19 percent for Decoding. The weighted average communalities for all
analyses (see table 28.15) do not yield significant enlargements of
communalities as found in this one analysis, except in the cases of Spatial
Reasoning and Figure Analogies.
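
The comparison in item (5) is simple arithmetic. Assuming orthogonal factors, a test's communality is the sum of its squared loadings, and the undefined nonerror variance is the reliability minus that communality. The Python sketch below shows the calculation; the function names, the loadings, and the reliability of 0.80 are hypothetical illustrations and are not taken from tables 7.20 or 7.21.

```python
def communality(loadings):
    """Sum of squared loadings: variance shared with the common factors
    (orthogonal factors are assumed)."""
    return sum(a * a for a in loadings)


def unknown_nonerror_variance(reliability, loadings):
    """Reliable variance that the extracted common factors leave unexplained."""
    return reliability - communality(loadings)


# Hypothetical test: loadings on three factors, assumed reliability 0.80.
example_loadings = [0.46, 0.35, 0.31]
h2 = communality(example_loadings)
gap = unknown_nonerror_variance(0.80, example_loadings)
print(f"communality = {h2:.2f}, unknown nonerror variance = {gap:.2f}")
```

Expressed as a percentage, a gap of this kind is the quantity reported for each experimental reasoning test in item (5).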

The analysis sheds light on several other problems. In the first place,
the visualization factor was first defined in this analysis. Secondly, the fact
that the Pursuit test (in this and other analyses) appears on the numerical
factor forces a broadening of the definition of the factor to include
noncomputational facility with numbers. Locating, observing, and
remembering numbers, as well as adding, subtracting, dividing, and multiplying,
are apparently involved. Finally, the appearance of the Pursuit test on the
perceptual-speed factor indicates that even apparently relatively
unimportant detail discrimination in a test will introduce perceptual variance.

EVALUATION OF REASONING TESTS

While arithmetic-reasoning tests are quite valid against the navigator
criterion,  they  are  unsatisfactory  in  that  they  are  factorially  complex.  No 
reasoning  test  was  found  that  is  valid  for  pilot  trainees. 

It  is  noteworthy  that  no  tests  have  high  loadings  either  on  the  familiar 
general-reasoning  factor  or  on  the  two  new  factors,  reasoning  II  and 
reasoning  III.  It  is  also  noteworthy  that,  typically,  there  remain  consider¬ 
able  amounts  of  unknown  nonerror  variance.  Future  research  should  de¬ 
fine  the  new  factors  and  account  for  the  unknown  variance.  The  area  of 
reasoning  tests  is  still  largely  unexplored. 

Conspicuous  by  their  absence  are  mentions  of  the  concepts  of  deductive 
and  inductive  reasoning.  These  logical  rubrics  seem  not  to  yield  valid 
descriptions  of  psychological  factors. 

The  Complex  Coordination  Test 

The  Complex  Coordination  Test,  code  number  CM701A,  a  psychomotor 
test,  is  mentioned  in  this  chapter  and  in  several  later  ones,  so  a  very  brief 
description  of  it  is  in  order  here.  It  is  a  serial,  choice-reaction-time  test 
in  which  each  stimulus  is  one  of  13  spatial  patterns  of  3  lights  each.  In 
systematic  correspondence  with  each  stimulus  pattern,  the  correct  re¬ 
sponse  is  a  unique  adjustment  of  imitation  stick-and-rudder  controls.  Each 
correct  reaction  automatically  brings  a  new  stimulus.  The  score  is  the 
number of reactions completed in 8 minutes.

CHAPTER EIGHT


Judgment Tests¹


INTRODUCTION 


Judgment  in  Aviation 

Judgment  has  been  one  of  the  major  areas  of  research  in  aviation  psy¬ 
chology.  An  important  reason  for  this  was  the  common  practice  of  flight 
instructors  to  place  errors  of  perception,  visualization,  and  reasoning  all 
under  the  broad  category  of  “poor  judgment.” 

Poor judgment is one of the most frequently mentioned reasons given
by instructors for eliminating cadets from flying training. It was
mentioned in 50 percent of the cases in an analysis of the reasons stated in
faculty-board proceedings for eliminating 1,000 aviation cadets from
elementary pilot training in the latter part of 1941. Some typical comments
classified under the category of poor judgment are: “dangerous judgment
in traffic”; “unable to make sound decisions in traffic or in the vicinity of
other planes”; “choice of fields and judgment in simulated forced landings
has been weak”; “unable to exercise safe judgment in the air”; and
“fails to discriminate safe from unsafe flying.”

In  an  attempt  to  clarify  the  concept  of  judgment,  aviation  psychologists 
asked  flight  instructors  to  define  what  they  meant  by  judgment.  Some 
typical  definitions  constructed  from  the  comments  of  flight  instructors  are: 

1.  The  ability  to  react  immediately  and  appropriately  to  stimuli  with 
which  an  individual  is  unacquainted. 

2. “Headwork” or the ability to react correctly without deliberation,
or  the  ability  to  fly  without  confusion  in  traffic  and  under  unusual  cir¬ 
cumstances. 

3.  Knowledge,  plus  speed  of  reaction,  plus  freedom  from  emotional 
confusion. 

4.  The  ability  to  react  appropriately  in  a  surprise  situation. 

5.  Ability  to  grasp  the  situation  as  a  whole,  not  being  absorbed  with 
minor  details. 

Previous  Studies  of  Pilot  Judgment 

During  World  War  I  tests  of  judgment  of  distance,  speed,  and  time 
were  used.  These  included  estimation  of  length  of  sticks,  of  the  relative 
speeds  of  four  revolving  disks,  of  the  time  required  for  sand  to  flow  from 
one  container  to  another,  and  of  the  curves  and  relative  speed  of  two 
white  spots  moving  along  converging  lines  in  a  horizontal  plane  (1). 

¹ Written by S/Sgt. Benjamin Fruchter.

In 1940, Kelly had flight instructors rate HO civilian pilot training
students on a 14-item graphical scale of pilot competency (3). Thirteen
of the items were intercorrelated and analyzed by Thurstone’s centroid
method. Three factors were found necessary to account for the
intercorrelations and were identified as (1) skill, (2) judgment, and (3)
emotional control. The items that had significant loadings on the judgment
factor are as follows:

How  good  is  his  judgment  with  regard  to  taking  flying  risks  (weather, 
stunting,  etc.)  ? 

Does  he  show  respect  for  a  ship  and  its  motor? 

How well is he satisfied with his flying ability?

Is he inclined to show off while flying a plane?

How carefully does he check his plane and engine before taking off?

Judgment  in  the  AAF  Qualifying  Examination,  AC10A 

The first form of the AAF Qualifying Examination, AC10A (see
Report No. 6), introduced in January 1942, included a number of
judgment-type items like the following:

A pilot has to make a forced landing near a mountain cabin. He finds that the
nearest phone is at an isolated fire ranger’s cabin 14 miles across the mountains
to the north. It is winter. He sets out on foot for the ranger’s cabin at 6 a.m.,
carrying food for only one meal. At 10 a.m., having met no one, he comes to three
branches of the trail, all unmarked. His most practical decision would be to:

A. Follow the trail which appears to lead in the right direction until he reaches
the cabin or the end of the trail.

B. Turn back immediately to his starting point.

C. Leave the trail and go due north by compass.

D. Walk along the trail which appears to lead in the right direction until noon,
then turn back if not sure of his location.

E. Stay in the fork in the trail and wait for someone to come by.

The  judgment  subtest  (15  items)  of  AC10A  had  a  reliability  of  0.36, 
based  on  370  pilot  students.  It  had  a  biserial  correlation  of  0.36  with  the 
criterion  of  graduation-elimination  from  primary  pilot  training,  based  on 
545 cases in class 42G. These facts seemed to substantiate the hypothesis
that practical judgment was a measurable psychological category and to
make desirable further analysis of the problem.
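
A biserial correlation of the kind quoted above relates a continuous test score to a pass-fail criterion that is assumed to reflect an underlying normal variable. The Python sketch below uses the standard textbook formula, r = (M_p − M_q) / SD × (pq / y), where y is the normal ordinate at the point cutting off the proportion p. The function name and the ten scores are hypothetical, and no correction for restriction of range is applied.

```python
from statistics import NormalDist, mean, pstdev


def biserial(scores, passed):
    """Biserial r between a continuous score and a pass/fail criterion.

    Assumes the dichotomy is an artificial cut on a normal variable.
    """
    hi = [s for s, ok in zip(scores, passed) if ok]
    lo = [s for s, ok in zip(scores, passed) if not ok]
    p = len(hi) / len(scores)          # proportion passing (graduates)
    q = 1.0 - p                        # proportion failing (eliminees)
    unit_normal = NormalDist()
    y = unit_normal.pdf(unit_normal.inv_cdf(p))  # ordinate at the cut
    return (mean(hi) - mean(lo)) / pstdev(scores) * (p * q / y)


# Hypothetical scores for six graduates followed by four eliminees.
scores = [12, 9, 7, 10, 8, 5, 6, 4, 9, 7]
graduated = [True] * 6 + [False] * 4
r = biserial(scores, graduated)
print(round(r, 2))
```

Because the coefficient leans on the normality assumption, it can exceed 1.0 in small or skewed samples, which is one reason large samples such as the 545 cases cited are preferable.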

Research  on  Judgment  Tests  for  Classification 

Research on judgment attempted (a) to analyze judgment tests and (b)
to analyze the concepts of practical judgment as described by instructors.
Both lines of attack were fruitful, but at the war’s end neither had been
exhausted, by any means.

In the rest of this chapter the complex nature of the tests is
demonstrated, and their unique contribution, a judgment factor, is revealed.
The role of background information in both judgment tests and the pilot
criterion was fairly well verified, and certain types of information (for
example, mechanical information) were found to be contributory to
success both in test performance and in primary pilot training.

The  relations  of  judgment  to  reasoning  and  perceptual  abilities  are 
pointed  out  in  this  and  in  other  chapters.  The  hypothesis  of  a  functional 
thought-fluency  factor  is  mentioned,  but  conclusions  cannot  be  reached 
owing  to  the  lack  of  final  results. 

PRACTICAL  JUDGMENT  TESTS 

Practical  Judgment,  CI301BX1  * 

The hypothesis was formed that a major part of the variance of typical
judgment-test items is attributable to individual differences in pertinent
informational background. This grew out of a conviction that the
judgment items in AC10A have a very high mechanical-knowledge component
and that this accounts in large part for the validity of the test. The
general hypothesis can be tested by a study of the mechanical-information
type of item alone. Such items were expected to correlate with Mechanical
Principles, CI903B, and Mechanical Information, CI905B, tests whose
mechanical content is known.

Description. (1) Internal characteristics.—Test CI301BX1 contains 80
items and covers a large range of situations which are considered solvable
by ordinary judgment. Twenty-eight items are considered to be dependent
primarily upon a knowledge of mechanics, and 52 items are considered
to be essentially nonmechanical. The items are segregated into two parts,
each composed of random halves of the mechanical and nonmechanical
items. The following are examples of a nonmechanical and a mechanical
judgment item respectively:

An officer must send an important confidential message about 4 miles through
enemy lines into an area which is very closely guarded. It is important that the
message not fall into enemy hands, but it is equally important that the message
get through. Under the circumstances it would be best for him to:

A. Write out the message and give it to a runner with orders to get through
as quickly as possible.

B. Send one runner with a decoy message with instructions to get through as
soon as possible.

C. Send one runner with the written message and another to act as his guard.

D. Write out duplicate forms of the message and give them to two runners
with instructions to get through as soon as possible.

E. Have two runners memorize the message and instruct them to get through
as soon as possible.

You are operating a large water-cooled motor with a heavy load, when you notice
that a bearing is heating excessively. It would be best for you to:

A. Stop the motor immediately and lubricate the bearing.

B. Remove the load, lubricate the bearing freely, with the motor running
slowly.

C. Run the motor slowly with the load.

D. Continue to operate the motor at present speed and lubricate the bearing.

E. Stop the motor immediately and add cold water.

* Developed in the Office of the Air Surgeon, Headquarters, Army Air Forces,
Psychological Research Unit No. 1.

(2) Administration.—Directions are printed in the booklet, and the test
is largely self-administering. The time allowed for part I is 40 minutes
and for part II, 40 minutes. It has been standard practice to allow 1
minute per item for this type of test.

(3) Scoring.—The scoring formula is R − W/4.
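
With five-choice items, the formula R − W/4 makes the expected gain from blind guessing zero: one guess in five earns a point, and each of the other four costs a quarter point. The Python sketch below illustrates the arithmetic. The function name, the answer key, and the treatment of omitted items as contributing nothing are assumptions for illustration, since the source gives only the formula.

```python
def formula_score(responses, key):
    """Rights minus one-fourth of wrongs for five-choice items.

    An omitted item is represented by None and is counted as neither
    right nor wrong (an assumption; the source states only R - W/4).
    """
    rights = sum(1 for r, k in zip(responses, key) if r is not None and r == k)
    wrongs = sum(1 for r, k in zip(responses, key) if r is not None and r != k)
    return rights - wrongs / 4


# Hypothetical 10-item key and one examinee's answers (two omissions).
key = list("ABCDEABCDE")
responses = ["A", "B", "C", "D", "A", "A", None, "C", "E", None]
score = formula_score(responses, key)
print(score)  # 6 rights, 2 wrongs -> 5.5
```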

Statistical results.—Some statistics were computed separately for the
28 mechanical items and the 52 nonmechanical items, and some for all
items together. All the data reported are for examinees tested in December
1942 at Psychological Research Unit No. 3.

(1) Distribution statistics.—Typical examples of distribution statistics
are given in table 8.1.


Table 8.1.—Distribution constants for Practical Judgment, CI301BX1, based upon
a sample of 202 unclassified aviation students

Score                     M       SD
Mechanical score         14.8     4.4
Non-mechanical score     25.6     5.7


(2)  Internal  consistency. — Analysis  of  responses  of  sample  groups 
yielded  the  internal-consistency  data  given  in  table  8.2. 


Table 8.2.—Internal-consistency data for Practical Judgment, CI301BX1, based upon
highest and lowest 27 percent of a sample of 202 unclassified aviation students

Criterion               Items          Mφ      SDφ     Range
Mechanical score        [illegible]    0.31    0.11    0.02 to 0.49
Non-mechanical score    [illegible]     .25     .11     .04 to  .46


(3)  Reliability  coefficient. — By  the  alternate-forms  method  (part  I  vs. 
part  II),  an  estimated  reliability  coefficient  of  0.62,  corrected  for  length, 
was  obtained.  This  figure  is  based  on  a  sample  of  202  unclassified  aviation 
students. 

(4)  Difficulty. — Based  upon  item  analysis  of  the  responses  of  500  un¬ 
classified  aviation  students,  the  test  yielded  a  mean  proportion  of  correct 
responses  of  0.53,  corrected  for  chance,  with  a  range  of  0.00  to  0.91  and 
a  standard  deviation  of  0.24. 

(5) Factorial composition.—The most significant loadings for the
mechanical-items score are in the mechanical-experience (0.54), judgment
(0.36), and visualization (0.29) factors; and for the nonmechanical
score, in the judgment (0.39), planning (0.36), and visualization (0.30)
factors. The mechanical-experience loading in the nonmechanical items
is only 0.13, which is in line with the hypothesis that some items contain
mechanical-information content and some do not, and that by design and
selection the two types can be fairly well segregated. The communalities
for the two types of items are 0.59 and 0.48 respectively. For a fuller
picture of the factorial composition of this test see appendix B.

(6) Test validity.—Validation results are given in table 8.3.

Table 8.3.—Validity data for Practical Judgment, CI301BX1, based upon
graduation-elimination from primary training of a sample of 267 pilots

Score                       M [subscript illegible]   M [subscript illegible]   SD
Mechanical judgment         15.15                     14.90                     3.[?]7
Non-mechanical judgment     25.76                     25.30                     5.44


(7) Item validity.—Validation of the mechanical-judgment and
nonmechanical-judgment items combined revealed a mean phi of 0.00,
based upon the responses of 200 graduates and 64 eliminees from training.
The standard deviation of phi values is 0.20 and the range is from −0.30
to 0.18.
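
The reliability reported under (3) was obtained by correlating part I with part II and then correcting for length. The source does not name the correction; the Python sketch below assumes it is the Spearman-Brown formula, the usual choice for stepping a half-test correlation up to full length, and works backward from the reported 0.62 to the part-I versus part-II correlation it would imply. Both function names are hypothetical.

```python
def spearman_brown(r_part, length_factor=2.0):
    """Reliability of a test lengthened by `length_factor`,
    given the correlation between comparable parts."""
    return length_factor * r_part / (1.0 + (length_factor - 1.0) * r_part)


def implied_part_correlation(r_full, length_factor=2.0):
    """Invert the Spearman-Brown formula: the part correlation that
    would produce the stepped-up reliability `r_full`."""
    return r_full / (length_factor - (length_factor - 1.0) * r_full)


# 0.62 is the corrected coefficient quoted in the text for CI301BX1.
r_half = implied_part_correlation(0.62)
print(round(r_half, 3))                   # about 0.449
print(round(spearman_brown(r_half), 2))   # recovers 0.62
```

Under this assumption the two 40-item parts correlated only about 0.45 with each other, which helps explain why the full-length reliability is modest.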

Evaluation.—An examination of the factor loadings of the two types of
judgment items supports the hypothesis that a large part of the variance
of the mechanical-judgment items is accounted for by individual
differences in mechanical information. The two types of items have
approximately equal weights on the visualization factor and also on a factor
identified as judgment. In either case, however, the variance in judgment is
only approximately 15 percent. Assuming that this factor is weighted 0.10
for the pilot criterion, for which there is some evidence, we should
expect a test of this type to add somewhat to the validity of a pilot battery
of which it is a part.

From the factorial composition of the two tests we should expect (see
chapter 28) a validity of 0.28 for the mechanical items (to be compared
with the 0.36 for the judgment test in AC10A) and 0.17 for the
nonmechanical items. The pilot-validity figures given in table 8.3 are by no
means typical of obtained validities for these kinds of items. Weighted
averages of a number of estimates of validities derived from similar forms
are 0.18 and 0.13, respectively. From these results we can be fairly
satisfied that all factors with pilot validity are known in these types of
judgment tests. From a comparison of reliabilities and communalities we can
conclude that all common factors, valid and invalid, are accounted for.
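
Two pieces of arithmetic sit behind this evaluation. The share of a score's variance due to one factor is the square of its loading, so the judgment loadings of 0.36 and 0.39 give roughly 13 and 15 percent. Assuming orthogonal factors, the validity expected from factorial composition is the sum, over factors, of the test loading times the criterion loading. In the Python sketch below, the test loadings are those quoted in the text and the criterion weight of 0.10 for judgment is the one the text assumes. Every other criterion weight is a hypothetical placeholder, so the printed validities will not reproduce the 0.28 and 0.17 derived in chapter 28.

```python
def predicted_validity(test_loadings, criterion_loadings):
    """Sum of products of test and criterion loadings over shared factors
    (orthogonal factors are assumed)."""
    return sum(a * criterion_loadings.get(factor, 0.0)
               for factor, a in test_loadings.items())


# Loadings quoted in the text for the two item types.
mechanical_items = {"mechanical_experience": 0.54, "judgment": 0.36,
                    "visualization": 0.29}
nonmechanical_items = {"judgment": 0.39, "planning": 0.36,
                       "visualization": 0.30, "mechanical_experience": 0.13}

# Only the judgment weight (0.10) comes from the text; the rest are
# hypothetical placeholders chosen for illustration.
pilot_criterion = {"judgment": 0.10, "mechanical_experience": 0.30,
                   "visualization": 0.20, "planning": 0.05}

judgment_share_mech = mechanical_items["judgment"] ** 2
judgment_share_nonmech = nonmechanical_items["judgment"] ** 2
print(f"{judgment_share_mech:.2f} {judgment_share_nonmech:.2f}")
print(f"{predicted_validity(mechanical_items, pilot_criterion):.3f}")
print(f"{predicted_validity(nonmechanical_items, pilot_criterion):.3f}")
```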

Practical Judgment, CI301BX2

This test is a 40-item revision of CI301BX1. The items of the previous
form having mechanical content are eliminated. This was done on the
basis of a factor analysis of the previous form in a matrix containing
selected tests from the classification battery. This analysis indicates that
nonmechanical judgment items, in addition to mechanical and intellectual
loadings, define a new factor tentatively characterized as a judgment
factor. The revision was made in an attempt to reduce the mechanical
loadings and increase the loading on the new factor.

Description. (1) Internal characteristics.—A sample item follows:

The  principal  reason  why  barracks  in  Army  camps  are  built  according  to  an 
identical  plan  is  that  this  method: 

A.  Requires  the  least  use  of  construction  materials. 

B.  Makes  possible  the  greatest  speed  of  construction. 

C.  Allows  construction  of  barracks  that  will  last  a  long  time. 

D. Requires a low cost of upkeep of the barracks after completion.

E.  Results  in  a  military  appearance  which  is  similar  in  all  Army  camps. 

(2)  Administration. — The  two  parts  are  timed  separately,  with  an 
allowance  of  20  minutes  per  part. 

(3) Scoring.—The scoring formula is R − W/4.

Statistical  results. — The  data  reported  below  are  for  unclassified  avia¬ 
tion  students  tested  in  February  1943  at  Psychological  Research  Unit 
No.  3.  Those  who  entered  primary  pilot  training  were  in  class  43K. 

(1) Distribution statistics.—Typical examples of distribution statistics
are given in table 8.4.


Table 8.4.—Distribution constants for Practical Judgment, CI301BX2

Group                  N         M       SD
[illegible]            571      11.7     4.2
Classified pilots      4[?]8    10.9     4.4


(2) Internal consistency.—The degree of homogeneity of the items of
the test is indicated by a mean internal-consistency phi of 0.25, a standard
deviation of the phi distribution of 0.10, and a range of values from
−0.11 to 0.48. These statistics are based upon analysis of the responses
of the highest 27 percent and the lowest 27 percent in total score of a
group of 360 unclassified aviation students.

(3)  Reliability  coefficient. — By  the  alternate-forms  method,  an  esti¬ 
mated  reliability  coefficient  of  0.43,  corrected  for  length,  was  obtained. 
This figure is based on a sample of 485 classified pilots.

(4)  Difficulty. — Based  upon  item  analysis  of  the  responses  of  480  un¬ 
classified  aviation  students,  the  test  yielded  a  mean  proportion  of  correct 
responses  of  0.33,  corrected  for  chance,  with  a  range  from  0.00  to  0.88 
and  a  standard  deviation  of  0.20. 

(5) Test validity.—A sample of 571 pilots yielded a biserial correlation
of −0.02, corrected for restriction of range, between performance in
this test and the graduation-elimination criterion in primary training. The
mean score for graduates was 11.70, for eliminees 11.86, and the standard
deviation for both combined was 4.20. Of this sample [?]5 percent were
graduates, and the standard deviation assumed for the unrestricted pilot
stanine distribution was 2.00.

(6) Item validity.—Validation of items revealed a mean phi of 0.03,
based upon the responses of 500 graduates and 100 eliminees from
primary training. The standard deviation of phi values is 0.07, and the range
is from −0.11 to 0.15.

Reasoning content of Practical Judgment, CI301BX2.—The hypothesis
was adopted that the examinees achieved correct solutions to many of
the  items  in  this  form  by  reasoning.  To  test  this  hypothesis,  it  was  decided 
to  do  an  item  analysis  using  the  same  360  cases  that  were  used  in  the 
internal-consistency  item  analysis,  basing  the  phi  coefficients  this  time  on 
the  criterion  of  Arithmetic  Reasoning  scores,  which  best  defines  the  gen¬ 
eral  reasoning  factor.  The  product-moment  correlation  coefficient  between 
the  internal-consistency  phis  and  those  based  upon  the  arithmetic-reason¬ 
ing  criterion  is  0.56.  This  is  considered  corroboration  of  the  hypothesis 
that  Practical  Judgment  Test,  CI301BX2,  contains  considerable  reasoning 
content. 
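
The comparison of the two sets of item phis amounts to a product-moment correlation across items. A minimal sketch follows; the two short lists of phi values are hypothetical and serve only to show the computation.

```python
import math

def pearson_r(xs, ys):
    """Product-moment correlation between two equally long lists of values."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical phis for five items against the two criteria.
internal_consistency_phis = [0.10, 0.25, 0.31, 0.40, 0.18]
arithmetic_reasoning_phis = [0.05, 0.12, 0.20, 0.22, 0.02]
r = pearson_r(internal_consistency_phis, arithmetic_reasoning_phis)
```

A high value would indicate that the items most consistent with the total score are also the items most related to Arithmetic Reasoning.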

Evaluation. — This attempt to enhance measurement of the judgment
factor found in the factorial analysis of the Foresight and Planning I
battery (see ch. 9) was considered unsuccessful, inasmuch as the items
are largely solvable by a type of reasoning already measured by the
Arithmetic Reasoning Test, CI206C. This factor has zero validity for
pilot selection.

Practical Judgment, CI301BX3 *

This  test  is  a  revision  of  Practical  Judgment,  CI301BX2.  It  represents 
another  attempt  to  clarify  the  nature  of  the  judgment  factor. 

Description. (1) Internal characteristics. — Part I contains the 25 items
from the previous form having internal-consistency phis of 0.25 or
higher. Part II contains 25 newly constructed items of the work-planning
type which seemed most likely to measure the new factor.

The following is a sample of the type of item in part II:

At a mobile army encampment, it is necessary to use buckets to fight a fire 100
feet away from a stream and up a hill. Fifteen 5-gallon buckets and 60 men are
available. The best procedure would be to:

A. Select 15 men and have each run between the stream and the fire carrying
buckets, and replace the entire crew at intervals.

B. Line up the 60 men from the building to the stream and pass the buckets
from man to man from the stream to the building, the last man throwing
the water on the fire and returning with the empty bucket.

C. Make one line of 25 men to pass the buckets from the stream to the
building with one line of 10 men to throw the empty buckets back; 2 men to dip,
2 men to throw water on the fire, and the rest for relief.

D. Detail 2 men to dip and 2 to throw water on the fire; assign 26 to carry
buckets from the stream to the building and back; replace the entire crew
with 30 fresh men in 15 minutes.

E. Detail 5 men to fill the buckets at the stream and leave them at the bank,
where they can be picked up by the rest of the men who will carry them
to the building and return with empty buckets.

(2)  Administration.— The  time  is  25  minutes  each  for  Parts  I  and  II. 

(3) Scoring. — The scoring formula for this test is R - W/4.
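
The scoring formula can be read as a correction for guessing on five-choice items: one-fourth of a point is deducted for each wrong answer, and omitted items count neither way. A minimal sketch, with an illustrative function name and hypothetical answer counts:

```python
def formula_score(rights, wrongs, n_choices=5):
    """Rights minus wrongs divided by (number of choices - 1).
    With five choices this is R - W/4; omitted items are ignored."""
    return rights - wrongs / (n_choices - 1)

# Hypothetical examinee on a 50-item test: 20 right, 12 wrong, 18 omitted.
score = formula_score(rights=20, wrongs=12)

# Blind guessing on all 50 items would be expected to give 10 right and
# 40 wrong, which the formula scores as zero.
guessing_score = formula_score(rights=10, wrongs=40)
```

The deduction is chosen so that an examinee who guesses blindly gains nothing on the average.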

* Developed at Psychological Research Unit No. 1. Chief contributors: Lt. Lewis G. Carpenter,
Jr., and Lt. Frank J. Dudek.

Statistical results. — All results reported below are for examinees tested
at Psychological Research Unit No. 3.

(1)  Distribution  statistics. — A  sample  of  242  unclassified  aviation  stu¬ 
dents  in  May  1943  yielded  a  mean  score  of  17.2,  a  standard  deviation  of 
5.3,  and  a  range  from  3  to  34. 

(2)  Internal  consistency. — The  degree  of  homogeneity  of  the  items  is 
indicated  by  a  mean  internal-consistency  phi  of  0.28,  a  standard  deviation 
of  the  phi  distribution  of  0.12,  and  a  range  of  values  from  —0.05  to  0.46. 
These statistics are based upon analysis of the responses of the highest 27
percent and the lowest 27 percent in total score of a group of 480
unclassified aviation students tested in July 1944.

(3)  Reliability  coefficient. — Estimates  of  reliability  computed  for  the 
two  part-scores  for  a  sample  of  pilots  are  given  in  table  8.5. 


Table 8.5. — Estimated reliability coefficients by the odd-even method for Practical
Judgment, CI301BX3, based upon a sample of 167 classified pilots

Score . . . . . . . . . . .    r1I     rtt
Part I  . . . . . . . . . .    0.17    0.54
Part II . . . . . . . . . .     .26     .41
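
Odd-even reliabilities are ordinarily stepped up to full length by the Spearman-Brown formula. The sketch below assumes that this is how the second column of the table was obtained from the first; the column heads are damaged in this copy, so that reading is an assumption. Under it, the part II row (.26 and .41) is reproduced.

```python
def spearman_brown(r_half, length_factor=2.0):
    """Reliability of a test lengthened `length_factor` times, given the
    correlation between comparable parts of the original length."""
    return length_factor * r_half / (1.0 + (length_factor - 1.0) * r_half)

# Part II: a correlation of .26 between odd and even halves
# steps up to about .41 for the whole part.
full_length = round(spearman_brown(0.26), 2)
```

The same function covers other corrections for length by changing `length_factor`.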

(4)  Difficulty. — In  an  item  analysis  of  the  responses  of  480  unclassified 
aviation  students  tested  in  July  1944,  the  test  yielded  a  mean  proportion 
of  correct  responses  of  0.32,  corrected  for  chance,  with  a  range  from 
0.00  to  0.78,  and  a  standard  deviation  of  0.22. 

(5)  Factorial  composition. — Part  I  and  part  II  of  the  test  were  treated 
separately  in  analyzing  this  test.  The  most  significant  loadings  for  part  I 
are in the judgment (0.30), verbal (0.30), visualization (0.29), and
planning (0.28) factors. The most significant loadings for part II are in the
judgment  (0.45),  general  reasoning  (0.40),  and  mechanical-experience 
(0.32)  factors.  Part  I  has  a  loading  of  0.03  in  the  general-reasoning  fac¬ 
tor,  and  part  II  has  a  loading  of  0.03  in  the  verbal  factor  and  only  0.08 
in  the  planning  factor.  The  communalities  are  0.49  and  0.51  for  parts  I 
and  II,  respectively.  For  a  fuller  picture  of  the  factorial  composition  of 
this  test  see  appendix  B. 
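
The relation between the loadings and the communalities can be sketched as follows, assuming orthogonal factors so that a communality is the sum of squared loadings. Only the loadings quoted in this paragraph are used; the remaining loadings are in appendix B, so the sums fall short of the reported communalities of 0.49 and 0.51.

```python
def sum_of_squared_loadings(loadings):
    """Portion of a test's variance accounted for by the listed factors,
    assuming the factors are uncorrelated."""
    return sum(value ** 2 for value in loadings.values())

# Loadings quoted in the text for the two parts of CI301BX3.
part_i = {"judgment": 0.30, "verbal": 0.30, "visualization": 0.29,
          "planning": 0.28, "general reasoning": 0.03}
part_ii = {"judgment": 0.45, "general reasoning": 0.40,
           "mechanical experience": 0.32, "verbal": 0.03, "planning": 0.08}

partial_i = sum_of_squared_loadings(part_i)
partial_ii = sum_of_squared_loadings(part_ii)
```

The difference between each partial sum and the reported communality is the share contributed by factors not quoted here.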

(6)  Item  validity. — Validation  of  items  revealed  a  mean  phi  of  0.01, 
based upon the responses of 200 graduates and 64 eliminees from primary
pilot  training  (class  43J).  The  standard  deviation  of  phi  values  is  0.07, 
and  the  range  is  from  —0.28  to  0.19. 

Evaluation. — The newly constructed items in part II of this test have
a loading of 0.45 on the judgment factor, compared with a loading of
0.30 for the items in part I, which were taken from previous forms of the
test. It may therefore be concluded that items of the work-planning type
best define the judgment factor. Reliabilities of scores continue to be low,
though communality of items cannot be doubted. The existence of a
judgment factor seems to be fairly well verified, but its exact nature needs
further clarification. The type of test in which it can be very strong and
unique is yet to be found. It is present, however, in tests that are not
designated as judgment tests, so it is possible that some new type of test
can be designed to improve its measurement.

Practical  Judgment,  CI301C  * 

This form of the test was compiled from previous forms for inclusion
in the September 1944 classification battery. The items were selected on
the following bases: 1. validity; 2. internal consistency; 3. full coverage
of field; 4. nonmechanical content; 5. difficulty level.

A  sample  item  is: 

A man on a very urgent mission during a battle finds he must cross a stream
about 40 feet wide. A blizzard has been blowing and the stream has frozen over.
However, because of the snow, he does not know how thick the ice is. He sees
two planks about 10 feet long near the point where he wishes to cross. He also knows
where there is a bridge about 2 miles downstream. Under the circumstances he
should:

A. Walk to the bridge and cross it.

B. Run rapidly across on the ice.

C. Break a hole in the ice near the edge of the stream to see how deep the
stream is.

D. Cross with the aid of the planks, pushing one ahead of the other and
walking on them.

E. Creep slowly across on the ice.

(1) Administration. — Thirty minutes are allowed for the 30 items.

(2) Scoring. — The scoring formula is 2R - 2W/3.

Statistical  results. — Only  distribution  statistics  were  available  at  the 
time  this  was  written.  For  2,917  unclassified  aviation  students  tested  at 
Medical  and  Psychological  Examining  Units,  the  mean  score  was  20.0, 
and  the  standard  deviation  8.6. 


Practical  Judgment,  CI301DX1  * 

The  items  in  this  form  were  constructed  to  provide  a  reserve  pool  of 
validated  judgment  items  for  future  revisions  of  test  CI301C. 


PRACTICAL  ESTIMATION  TESTS 


These  tests  were  constructed  primarily  for  the  purpose  of  analyzing  the 
informational  background  of  judgment  tests.  It  was  noted  that  numerous 
items  in  the  Practical  Judgment  tests  require  the  examinee  to  make  esti¬ 
mates  of  the  weights  of  certain  objects,  of  the  amounts  of  time  necessary 
to  carry  out  certain  tasks,  etc.  It  was  deemed  desirable  to  discover  how  im¬ 
portant  this  content  was  in  defining  the  judgment  factor.  If  this  ability 
should  prove  to  be  the  unique  component  of  judgment  tests,  it  would  be 
possible to construct purer tests of judgment.


* Developed at Psychological Research Unit No. 3.

• Developed at Psychological Research Unit No. 2. Chief contributors: Cpl. Robert E. f ink art,
leanna U Liftman, Sgt. Robert ». Porter.

Practical Estimations, CI308AX1 *

Description. (1) Internal characteristics. — The 86 items of this test are
divided into 4 parts. Parts I and II call for relative judgments; parts III
and IV, for absolute judgments. Part I contains items in each of which
the amounts of time required in five different situations are compared. In
part II five distances or sizes of five different objects are compared in
each item. In part III questions are asked about the amount of time
required for a particular activity, and the answer is selected from a scale of
15 steps. In part IV questions about the length and sizes of objects are
asked, and the answers are selected from a scale of 15 steps. Two
representative problems are:

Which of the following could be done in the shortest time?

A. Walking 5 miles on snowshoes.

B. Riding a horse 18 miles.

C. Swimming 2 miles.

D. Rowing a boat across a lake 2¼ miles wide.

E. Walking 6 miles on flat terrain.

Of the following, which is the shortest?

A. Six building bricks laid end to end.

B. The length of wire used in making a wire coat hanger.

C. The distance from the floor to the door knob.

D. The width of the average door.

E. Four sheets of typing paper laid end to end.

(2) Administration. — The time limit for part I is 12 minutes; for part
II, 9 minutes; for part III, 15 minutes; and, for part IV, 11 minutes.

(3) Scoring. — The scoring formula for parts I and II is R - W/4; for
parts III and IV, R - W/5.

Statistical results. — The data reported below are for examinees tested
in April and May 1943 at Psychological Research Unit No. 3.

(1) Distribution statistics. — A sample of 237 unclassified aviation
students yielded a mean total score of 29.8, a standard deviation of 7.1, and
a range from 11 to 47.

(2) Internal consistency. — The degree of homogeneity of the items is
indicated by a mean internal-consistency phi of 0.13, a standard deviation
of the phi distribution of 0.11, and a range of values from -0.01 to 0.39.
These statistics are based upon analysis of the responses of the highest 27
percent and the lowest 27 percent in total score of a group of [illegible]
unclassified aviation students.

(3) Reliability coefficient. — Estimates of reliability computed for each
part of the test are given in table 8.6.


Table 8.6. — Reliability coefficients for part scores of Practical Estimations Test,
CI308A, computed by the odd-even method, based upon a sample of 183
unclassified aviation students

Score
Part I   . . . . . . . . . .    0.27           0.41
Part II  . . . . . . . . . .    -.01           [illegible]
Part III . . . . . . . . . .    [illegible]     .11
Part IV  . . . . . . . . . .    [illegible]     .43

(4) Difficulty. — Based upon item analysis of the responses of [illegible]
unclassified aviation students, the test (all four parts combined) yielded a
mean proportion of correct responses of 0.34, corrected for chance, with
a range from 0.02 to 0.71 and a standard deviation of 0.16.

(5) Factorial composition. — The scores for part I and part II were
treated separately for factor-analysis purposes. The most significant
loadings for part I are in the planning (0.36), judgment (0.36), and
mechanical-experience (0.33) factors. The most significant loadings for part II
are in the mechanical-experience (0.32), planning (0.31), and numerical
(0.28) factors. The numerical loading in part I is 0.13, and the judgment
loading in part II is 0.02. The communalities are 0.39 and 0.35,
respectively. The first one is very close to the estimate of reliability, and the
second one clearly shows that the reliability of part II was grossly
underestimated. For a fuller picture of the factorial composition of this test
see appendix B.

Items calling for absolute judgments had low communality with
estimation items calling for relative judgments, with Practical Judgment,
CI301BX1, and with selected experimental and classification tests. They
were, therefore, not included in the matrix for factor analysis.

Evaluation. — Items calling for relative judgments, though having low
reliability, gave indication of having substantial communality with the
experimental battery with which they were administered. It is interesting
to note that of the two parts of the Practical Estimations Test, CI308AX1,
that were analyzed with the Foresight and Planning II battery (see ch. 9),
the relatively complicated judgments of part I (involving estimations of
distance and time) have a significant loading (0.36) on the judgment
factor, whereas the relatively simple judgments of part II (involving
judgments of distance only) have an insignificant loading (0.02).

The pilot validities to be expected from the factor compositions are
0.15 and 0.14 for parts I and II, respectively (see table 28.18). The
average validities found for the comparable form, CI308BX1 (see ensuing
discussion), are 0.14 and 0.13, which check very closely. These data and
the comparison of communalities and reliabilities indicate that all common
factors, valid or invalid, are accounted for.
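
A validity "expected from the factor composition" can be sketched as the sum, over factors, of the test's loading multiplied by the validity of that factor for the criterion. This assumes uncorrelated factors. The part I loadings below are those quoted earlier in this section; the factor validities are hypothetical placeholders, not the values of table 28.18, so the result is not meant to reproduce the 0.15 reported above.

```python
def expected_validity(test_loadings, factor_validities):
    """Sum of (test loading x factor validity) over the listed factors.
    Factors without a stated validity are treated as contributing zero."""
    return sum(loading * factor_validities.get(factor, 0.0)
               for factor, loading in test_loadings.items())

# Part I loadings as quoted in the text.
part_i_loadings = {"planning": 0.36, "judgment": 0.36,
                   "mechanical experience": 0.33, "numerical": 0.13}

# HYPOTHETICAL factor validities, for illustration only.
placeholder_validities = {"planning": 0.2, "judgment": 0.1,
                          "mechanical experience": 0.3}

estimate = expected_validity(part_i_loadings, placeholder_validities)
```

Close agreement between such an estimate and the validity actually found is what the text means by the two figures checking.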

Practical Estimations, CI308BX1 *

This test is a revision of Practical Estimations, CI308AX1. The items
in the previous form calling for absolute, rather than relative, judgments
had so little communality that they were dropped from this form. This
form incorporates the items from the previous form calling for relative
judgments, plus newly constructed items of the same nature.

* Developed at Psychological Research Unit No. 1. Chief contributors: CH *«Wrt K. f iikirt,
r*«. )MM A. Wtwtr.

Description. (1) Internal characteristics. — Part I contains judgments
involving time and distances. Part II contains judgments involving
distances.

The following items illustrate the types in parts I and II respectively:

Which of the following travels fastest?

A. A batted baseball as it leaves the bat.

B. A polo ball as it leaves the mallet.

C. An arrow as it leaves the bow.

D. A tennis ball just after the serve.

E. A prizefighter’s fist in the middle of the swing.

Which of the following is most nearly the same as the distance from one side
of the car windshield to the other side?

A. The distance of a car doorhandle above the ground.

B. The distance of a doorknob above the house floor.

C. The length of an unfolded newspaper.

D. The width of an average door.

E. The width of an ordinary desk.

(2) Administration. — The two parts are timed separately, 23 minutes
being allowed for the 35 items in each part.

(3) Scoring. — The scoring formula is R - W/4.

Statistical  results. — The  data  reported  below  are  for  examinees  tested 
at  Psychological  Research  Unit  No.  3  in  June  and  July  1944. 

(1)  Distribution  statistics. — Typical  examples  of  distribution  statistics 
are  given  in  table  8.7. 


Table 8.7. — Distribution constants for Practical Estimations, CI308BX1, based upon
a sample of 750 unclassified aviation students

(2) Internal consistency. — The degree of homogeneity of the items is
indicated by a mean internal-consistency phi of 0.13, a standard deviation
of the phi distribution of 0.10, and a range of values from -0.05 to 0.39.
These statistics are based upon the responses of the highest 27 percent and
the lowest 27 percent in total score of a group of 750 unclassified aviation
students.

(3) Difficulty. — Based upon item analysis of the responses of 400
unclassified aviation students, the test yielded a mean proportion of correct
responses of [illegible], corrected for chance, with a range from 0.00 to 0.75
and a standard deviation of 0.16.

(4) Test validity. — Validation data are shown in table 8.8.


Table 8.8. — Validity data for Practical Estimations, CI308BX1, based upon samples
of pilots, using the graduation-elimination criterion

Evaluation. — The moderate pilot validity of this test indicates that the
judgment factor has low to moderate validity for the prediction of success
in pilot training, after validities due to mechanical-experience and
planning factors are taken into account.


Evaluation of Practical Estimation Tests

An attempt was made to discover whether tests calling for estimations
of time and distance would best define the judgment factor. It was
concluded that items that call for complicated, relative estimations have a
significant loading in the judgment factor. Practical estimation tests are
not, however, a pure measure of the judgment factor. In addition to their
loadings with the judgment factor, they also have significant loadings with
the planning and mechanical-experience factors. Moreover, they were
found to be less heavily weighted with the judgment factor than are
practical-judgment items of the work-planning type.


FLUENCY  TESTS 


Another aspect of the concept of practical judgment that was
considered worth investigating is the ability to call to mind experiences and
hypotheses that are of aid in solving a practical problem. It is supposed
that the ease of evocation of pertinent facts and hypotheses is an
important element in solving problems which do not lend themselves to
conventional solutions.

Thurstone  (4)  in  his  factor  analysis  of  57  tests  identified  two  verbal 
factors.  The  first  was  defined  by  tests  in  which  the  subject  deals  with 
ideas and meanings of words. He called this factor "verbal relations." The
second was defined by tests in which the subject recalls single and isolated
words. He called this factor "fluency in dealing with words."

It was hypothesized that there is a more general fluency factor, not
confined to the recalling of words, which facilitates thinking toward the
solution of problems. It was intended to construct a number of tests of fluency
and to submit the hypothesis of one general fluency factor to the test of
factor analysis.

The measurement of fluency requires the use of unusual problem
situations to which the examinee is required to respond with the enumeration
of as many responses as occur to him in a given amount of time. On the
one hand, the multiple-choice type of item would seem to defeat one’s
purpose, since providing the examinee with ready responses makes recall
on his part unnecessary. At the same time, the standard multiple-choice
item seems to be indicated in order to avoid such irrelevant factors as
verbosity and speed of writing. Machine scoring is also almost a compulsion
in air-crew testing. The need for satisfying both demands, freedom of
response and machine scoring, presented many difficulties.

As a group, fluency tests were developed late in the war. Some were
completed in time for administration for intercorrelation studies, but none
in time for validation.

Verbal  Recognition,  CI322A  • 

This test is based upon the assumption that an examinee’s ease in
evoking solutions to practical problems is related to the ease with which he can
unscramble words that belong to a named category.

Description. (1) Internal characteristics. — The approach employed is to
name a category and then present 10 "scrambled words." The categories
are (1) animals, (2) building materials, (3) sports, (4) men’s clothing,
and (5) means of transportation. The first letter of each scrambled word
is capitalized, regardless of where it appears in the scrambled word. It is
believed that by capitalizing the first letter of each word the examinee is
forced to think of various alternatives in the given category starting with
the capitalized letter rather than to attempt the more trial-and-error task of
unscrambling each word. Perceptually difficult items were chosen with the
expectation that the examinee would prefer to resort to verbal fluency to solve
them. The words in part I are scrambled haphazardly. The same five
categories appear in part II, but the scrambled words are presented with all
the consonants arranged alphabetically, followed by the vowels presented
in the same way.
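
The part II arrangement can be sketched as a small routine. It rests on assumptions not stated in the text: the letter y is treated as a consonant, and when the first letter of the word occurs more than once, the earliest occurrence in the scrambled form is the one capitalized. The function name is illustrative.

```python
VOWELS = set("aeiou")

def scramble_part_ii(word):
    """Consonants in alphabetical order, then vowels in alphabetical order,
    with the word's original first letter shown as a capital."""
    letters = [ch.lower() for ch in word]
    first = letters[0]
    consonants = sorted(ch for ch in letters if ch not in VOWELS)
    vowels = sorted(ch for ch in letters if ch in VOWELS)
    arranged = consonants + vowels
    # Capitalize one occurrence of the original first letter.
    arranged[arranged.index(first)] = first.upper()
    return "".join(arranged)

examples = [scramble_part_ii(w) for w in ("Lion", "Orange", "Red")]
```

With the words used in the test directions, "Orange" would appear as "gnraeO" under this rule, whereas the haphazard form shown for part I is "nageOr".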

• Developed at Psychological Research Unit No. 2. Chief contributor: Sgt. David Cmwn.

(2) Administration. — The following are the directions for the test:

This is a test of your ability to recognize the names of things quickly and
accurately when the letters in the names have been mixed up. Two things will assist
you in figuring out what the scrambled words are:

1. The first letter of each word will be capitalized, regardless of where it
appears in the scrambled word.

2. The 10 words on a page are all names of the same kind of things, such as
colors in the sample problems below.

Colors

1. deR.

2. nageOr.

The answer to problem 1 is Red. The answer to problem 2 is Orange.

When the test begins, write the words you are able to unscramble easily on the
separate work sheet provided. Be sure to write each word after its corresponding
number and make the words readable. Misspelling of words will not be counted
against you, however.

This is a speed test. If you cannot unscramble a word quickly, go on to the next.

You must work rapidly, since you will be allowed only 1¼ minutes to unscramble
the 10 words in each group. Do not go on to the next group until the signal is given.

When  the  test  is  completed,  the  test  booklet  is  collected,  but  the  work 
sheet  is  retained  by  the  examinee.  A  specially  prepared  IBM  answer 
sheet is given him together with the following oral instructions:

The  names  of  the  five  groups  that  appeared  in  the  test  booklet  appear  at  the 
top  of  this  answer  sheet.  On  your  work  sheet  the  answer  to  item  1  should  be  Lion. 
Now  look  under  the  first  column  on  the  answer  sheet  headed  Animals  and  find  Lion. 
It  is  the  fifth  word  from  the  top.  Blacken  in  the  first  space  to  the  left  under  Lion. 
Do  this  for  each  of  the  answers  that  you  have  on  your  work  sheet. 

The time limits are 6¼ minutes for part I and 6 minutes for part II.

(3)  Scoring. — The  score  is  the  number  of  correct  responses. 

Verbal  Recognition,  CI322B  * 

Variations  of  the  test. — It  is  similar  to  form  A  except  that  the  examinee 
must consider all five categories simultaneously rather than just one at a
time,  as  in  the  previous  form.  At  the  top  of  each  page  of  the  test  booklet 
the  five  categories  appear  opposite  the  letters  A  through  E.  The  examinee 
indicates  the  category  to  which  the  unscrambled  word  belongs  by  blacken¬ 
ing  the  appropriate  space  on  the  IBM  answer  sheet.  This  obviates  the 
need  for  using  a  work-sheet  and  later  transcribing  the  answers  to  a  spe¬ 
cially  prepared  IBM  answer  sheet  for  machine  scoring.  This  form  of  the 
test  is  believed  to  be  more  difficult,  because  there  are  five  reference  cate¬ 
gories  for  each  response. 

Similarities Test, CI319A 10

This is a test of the ability to recall quickly previously acquired
information about common objects.

Description. (1) Internal characteristics. — Pairs of common objects are
listed in a workbook, each followed by 15 spaces in which the examinee
is directed to list ways in which 2 objects are alike. An attempt was made
to minimize the verbal factor by (1) limiting the number of words per
response and (2) employing relatively simple material.

* Developed at Psychological Research Unit No. 2. Chief contributor: Sgt. David &WW.

10 Developed at Psychological Research Unit No. 2. Chief contributor: Tech/Sgt. W. C.
Davis.

(2) Administration. — The following are the directions:

This is a test to see how quickly you can think of ways in which different objects
are alike. In this booklet 20 pairs of objects are presented, 10 pairs in part I and
10 pairs in part II. Under each pair are lettered spaces in which you will write down
as many ways as you can think of in which the objects are alike.

Now look at the sample below. Several similarities have been listed in the spaces
provided to show you how to enter your answers. A sample problem follows:

Sample problem:

Apple and Orange are alike:

A. sweet
B. round
C. colored
D. have seeds
E. fruit
F. have skins
G. grow on trees
H. . . . . .
I. . . . . .
J. . . . . .
K. . . . . .
L. . . . . .
M. . . . . .
N. . . . . .
O. . . . . .

Notice that the similarities listed concern real characteristics of the objects, such
as structure, use, or operation. Such statements as "bought in stores" and "cost
money," which do not describe the objects, are not acceptable as answers. Also note
that "both" is assumed and need not be written down. As indicated by the above
sample, you may use not more than three words in describing any similarity.

The items are timed separately, 1 minute per item. There are 10 items
in part I and 10 items in part II. When both parts of the test have been
completed, the examinee is given the following oral directions for
recording his answers on a 15-place IBM answer sheet:

We  will  now  record  the  answers  on  the  answer  sheet.  Look  at  the  front  of 
the  test  booklet.  The  sample  item,  number  26,  has  answers  listed  in  all  the  spaces  up 
to  and  including  H.  Find  number  26  at  the  top  of  the  right-hand  column  on  the 
answer  sheet  and  draw  a  solid  line  through  all  the  spaces  from  A  through  H 
opposite  it.  (Illustrate  on  board.)  Do  this  now.  Now  begin  with  item  number  1 
and  record  your  answers  in  this  manner  in  the  proper  spaces  on  your  answer  sheet. 
Work  as  rapidly  as  you  can. 

(3)  Scoring. — The  score  is  simply  the  number  of  responses,  one  unit  of 
credit  being  allowed  for  each  similarity  written  down.  This  assumes  that 
quality  of  responses  is  irrelevant.  It  is  intended  to  test  this  assumption  be¬ 
fore  using  this  numerical  index  in  factor  analysis. 

Word  Association  Test,  CI318A  11 

This test measures two assumed aspects of fluency: (1) rapidity of
association and (2) ease of change of set. Since there were no prospects for
its administration, the test was not printed. Its description, however, may
be of interest.

Description.  (1)  Administration. — The  following  arc  from  the  direc¬ 
tions  for  the  test : 

u  Developed  at  Psychological  Research  Unil  No.  2.  Chief  contributor*:  Stafl/Sgt.  Arthur  Z. 
Cerf,  Lt.  Cecil  H.  Patter*on. 
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This  is  a  test  of  your  ability  to  recognize  associations  between  words.  A  key 
word  will  be  given  which  may  have  several  meanings  and  may  be  associated  with 
one  or  more  of  five  words  opposite  it.  Your  task  is  to  select  the  words  that  seem 
to  have  similar  meaning  or  are  closely  associated  with  the  key  word,  and  to  blacken 
the  corresponding  spaces  after  the  item  numbers. 

Look  at  Sample  Problem  1: 

Sample  problem  l 

Key  word  A  B  C  D  E 

Order  neat  tried  command  purchase  single 

Alternates  A,  C,  and  D  are  correct  A— neat  means  orderly;  C — command,  to 
give  an  order;  D— purchase,  to  order  material.  Blacken  spaces  A,  C,  and  D  after 
number  1  on  your  answer  sheet 
In  taking  the  test,  here  are  the  things  to  remember: 

1. There may be 1, 2, 3, 4, or 5 correct responses to each item.

2. Correct answers are those which have the same meaning or are closely
associated with the key word. Some slang expressions, which may be
correct, will be included.

3. Words which sound the same as the key word but have a different meaning
are considered wrong responses.

The time limits for the test are: part I, 5½ minutes; part II, 5½
minutes.

(2) Scoring. — The scoring formula for the test is R − W/2 (rights minus
one-half the wrongs).

Camouflaged Words, CI323A 12

This test is a modification of Thurstone's Mutilated Words. Two factors
are thought to be measured by this test: (1) ease of evocation of
hypotheses (fluency) and (2) changeability of set.

Description. — The  items  were  pretested,  and  an  attempt  was  made  to 
secure  items  of  approximately  0.5  difficulty  with  2  seconds  permitted  per 
response.  Each  mutilated  word  has  two  items  based  upon  it. 


FIGURE 8.1

SAMPLE ITEM OF CAMOUFLAGED WORDS,
CI323A

12 Developed at Psychological Research Unit No. 2. Chief contributors: Florence R. Gcwaamaa,
Capt. John I. Lacey.

(1) Internal characteristics. — The test is divided into 2 comparable
parts of 30 problems each (based on 15 mutilated words in each part).

(2) Administration. — The following are from the directions to the
test:

Pilots must be able to detect and identify camouflaged objects. The purpose of
this test is to determine how well you can identify words, parts of which have been
removed or camouflaged.

Look  at  the  sample  problems  below.  (See  figure  8.1.) 

Both  sample  problems  above  are  based  on  the  camouflaged  word  at  the  left.  Two 
categories,  A.  Sport,  and  B.  Article  of  Clothing,  are  listed  for  problem  1.  The 
word  that  has  been  camouflaged  is  the  name  of  a  Sport  or  the  name  of  an  Article 
of  Clothing.  Your  task  is  to: 

a. Think of words you know in each category until you discover the word that
has been camouflaged.

b. Mark your answer sheet A or B according to the correct category.

The same procedure is repeated for problem 2 with two new categories.

The time limits for the test are as follows: practice problems, 3 minutes;
administration, 5 minutes; part I, 10 minutes; part II, 10 minutes;
total time, 28 minutes.

(3) Scoring. — The scoring formula for this test is R − W (rights minus wrongs).
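
For a test with two choices per problem, rights minus wrongs coincides with the usual correction for guessing, R − W/(k − 1), taken at k = 2. The following is a minimal Python sketch of that general formula, not a record of how the answer sheets were actually scored. It assumes that omitted problems count as neither right nor wrong; the function name and the sample counts are illustrative only.

```python
def formula_score(right, wrong, choices):
    """Correction-for-guessing score: R - W / (k - 1).

    right   -- number of correct responses (R)
    wrong   -- number of incorrect responses (W); omits are not counted
    choices -- number of alternatives per item (k), must be at least 2
    """
    if choices < 2:
        raise ValueError("an item needs at least two alternatives")
    return right - wrong / (choices - 1)


# Two-choice problems (category A or B): the formula reduces to R - W.
# The counts below are invented for illustration.
score = formula_score(right=40, wrong=12, choices=2)
print(score)  # 28.0
```

With more alternatives per item the penalty for each wrong answer shrinks, since a blind guess is then less likely to succeed.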

Ambiguous Ink Blots, CI317A 13

This  test  is  designed  to  measure  the  speed  with  which  varied  responses 
can  be  evoked  from  constant  stimuli  consisting  of  ink  blots.  The  test  was 
not  printed,  since  there  were  no  prospects  for  its  administration. 

Description. (1) Administration. — The following are from the directions
to the test:

This  is  a  test  to  determine  the  number  of  objects  you  can  find  easily  in  a  mass  of 
blots  and  lines  which  at  first  may  appear  to  be  meaningless.  You  will  be  shown  a 
picture,  followed  by  the  names  of  15  objects.  Your  task  is  to  study  the  picture  and 
indicate  which  ones  of  the  15  objects  you  can  see  in  the  picture. 

Look  at  sample  problem  1,  and  decide  which  of  the  objects  A  through  O  can  be 
found  in  this  picture.  For  each  object  that  you  can  find,  blacken  the  appropriate 
space  on  your  answer  sheet.  Do  this  now.  (See  Figure  8.2) 


FIGURE 8.2

SAMPLE ITEM OF AMBIGUOUS INK BLOTS,
CI317A

13 Developed at Psychological Research Unit No. 2. Chief contributor: Staff/Sgt. Arthur Z.
Cerf.

For this sample problem, the listed responses are: A. Distilling apparatus;
B. Baseball catcher's mitt; C. Coiled spring; D. Fire-hose
nozzle; E. Bar stool; F. Typewriter; G. Oriental lantern; H. Two-wheeled
cart; I. Stairway; J. Desk calendar; K. Country mail box;
L. Crown; M. Turtle; N. Cat's head; and O. Smoker's pipe.

In the sample problem which you have just finished, almost everyone is able to
find: A. Distilling apparatus; C. Coiled spring; E. Bar stool; and O. Smoker's
pipe. Some people are also able to find: G. Oriental lantern; K. Country mail box;
and N. Cat's head. A few are able to find: I. Stairway; and L. Crown. Practically
no one is able to find B. Baseball catcher's mitt; D. Fire-hose nozzle; F. Typewriter;
H. Two-wheeled cart; J. Desk calendar; and M. Turtle. There are no
absolutely right or wrong answers. The important thing is that you indicate
accurately the objects you can see.

(2) Scoring. — The scoring formula for this test is the number of responses.
This assumes that the quality of response is irrelevant.

Ambiguous  Figures,  CI316A 14 

This test is believed to measure the ability of the examinee to evoke as
many relationships as possible from a pair of geometric figures.

Description.  (1)  Administration. — The  following  are  the  directions 
for  the  test : 

This is a test of your ability to find relationships between geometric figures.

Look at sample problem No. 1 below. (See figure 8.3.)

FIGURE 8.3

SAMPLE ITEM OF AMBIGUOUS FIGURES,
CI316A


14 Developed at Psychological Research Unit No. 2. Chief contributors: Tech/Sgt. FmI C
Davit, Capt. John I. Lacey, Jims* U LipMa*, Tech/Sgt. GertW H.


X is related to Y in several ways:

1. Both figures have the same shape.

2. Y is X rotated 180°.

3. An additional line has been drawn from Y.

Your task is to find which of the choices, A through J, bear 2 or more of the
same relationships to Z as Y bears to X.

Examine choice A. It fulfills relationship 1, since both figures have the same
shape. It has also been rotated 180°, fulfilling relationship 2. Relationship 3 is
satisfied, as a line has been drawn across the top part of the figures. Since choice A
fulfills the requirement of having at least 2 relationships to Z, it is correct. Blacken
A opposite No. 1 on your answer sheet.

Now examine choice B. It does not fulfill relationship 1, because both figures do
not have the same shape. Since the line through Z is at the top, while the line in B
is at the bottom, B may be considered to be rotated 180°. It does not fulfill
relationship 3, because an additional line has not been drawn across the top of the figure.
Since it meets only 1 relationship, it cannot be considered a correct choice. Notice
that size is not to be considered as one of the relationships.

Look at choice C. It fulfills relationship 1, because it has the same shape as Z.
Since it has been rotated 180°, it fulfills relationship 2. It does not have
relationship 3. However, since it bears at least two correct relationships, it is a correct
choice. Blacken C opposite No. 1 on your answer sheet now.

Go  through  the  remaining  choices,  D  through  J,  and  blacken  the  correct  answers. 
Do  this  now.  You  should  have  marked  choices  A,  C,  D,  and  H.  If  you  have  not 
marked  them  correctly,  do  so  now. 

The time limits are as follows: part I, 12 minutes; part II, 12
minutes.

(2)  Scoring. — The  scoring  formula  for  this  test  is  the  number  right. 

Evaluation  of  Fluency  Tests 

Seven  tests  of  fluency  were  constructed.  Complete  coverage  of  the  area 
was  not  obtained,  inasmuch  as  no  tests  of  fluency  based  upon  numbers  or 
pictures  were  constructed,  because  of  difficulties  involved  in  constructing 
tests  using  these  media. 

Four  tests  (Camouflaged  Words,  CI323A;  Verbal  Recognition, 
CI322A;  Verbal  Recognition,  CI322B,  and  Ambiguous  Figures,  CI316A) 
were  administered  to  samples  ranging  from  400  to  2,500  for  correlational 
and  factor-analysis  purposes  along  with  a  large  group  of  experimental  and 
classification tests. The intercorrelational data were not available at the
time  this  was  written. 

FACTOR  ANALYSIS  OF  JUDGMENT  TESTS 

The  Data 

In order to analyze types of judgment items systematically, a special
judgment-and-reasoning test was constructed by the Psychological Branch
of the Office of the Air Surgeon. A factor analysis utilizing this test was
later performed at Psychological Research Unit No. 3.

A subjective analysis of judgment items had suggested that the following
elements were involved in answering them:

•  CVm(  Stkf/Sft.  Zxnjimia  Frwckur,  Cipt  U*H  G.  HMaftfcft 
1. Word knowledge.

2.  Factual  information  of  a  practical  type. 

3.  Logical  reasoning  ability. 

4.  Mechanical  comprehension  and  information. 

5.  Ability  to  make  common-sense  judgments. 

The Tests. — To measure these five elements, several tests were constructed,
including four separate tests of reasoning ability and two tests
of mechanical comprehension. In addition, other variables whose relationships
with judgment items could be expected to throw light on the psychological
make-up of judgment items were prepared for inclusion in a
judgment-and-reasoning test. The following is an outline of the test:

Variable 1, general vocabulary, item Nos. 1-10: These items are intended
to provide a measure of word knowledge. A sample item follows:


Deft:

A. Skillful.

B. Insane.

C. Clumsy.

D. Split.

E. Light.

Variable 2, ten most valid judgment items in the AAF Qualifying Examination,
AC10A, items 11-20: This variable includes items, each one
of which had been found to have positive correlation with graduation-
elimination from pilot training. A sample item follows:

A  radio  aerial  has  broken.  It  formerly  led  from  the  roof  of  a  house  to  the  top  of 
a  30-foot  pole  set  in  the  ground  outside.  It  is  not  safe  for  a  man  to  climb  this  pole  as 
it  is  only  4  inches  in  diameter  at  the  base  and  2  inches  at  the  top.  Of  the  following 
the  most  practical  way  to  put  up  a  new  aerial  would  be  to 

A.  Use  a  30-foot  ladder. 

B. Take the pole down, attach the aerial, and then reset the pole.

C. Use a fishing pole to hook the aerial to the top of the pole.

D.  Make  a  noose  in  the  end  of  the  aerial  and  throw  it  over  the  top  of  the  pole. 

E.  Build  a  light  scaffold  around  the  pole. 

Variable 3, commonsense judgment, items 21-30: This part is intended
to measure ability to make commonsense judgments rather than
logical reasoning ability or mechanical comprehension and information. A
sample item follows:

A  bomber  squadron  is  over  enemy  territory  on  its  way  to  bomb  an  oil  refinery 
when  one  of  the  observers  notices  an  advanced  enemy  airdrome.  He  notifies  the 
squadron  leader.  It  would  be  best  for  the  squadron  leader  to 

A.  Order  the  bombers  in  the  squadron  to  continue  as  planned  and  report  the 
location  of  the  enemy  airdrome  on  arrival  at  their  base. 

B. Order the squadron to circle the airdrome while he radios its position to
his base.

C. Order half the bombers in the squadron to bomb the enemy airdrome and
the other half to carry out the mission against the oil refinery.

D. Order all the planes in the squadron to bomb the enemy airdrome and
return to their base.

E. Order the squadron to continue as planned while he returns to his base to
report the location of the enemy airdrome.

Variable 4, mechanical judgment, items 31-40: This part is intended to
measure ability to make judgments based mainly on mechanical comprehension
and information. Following is a sample item:

A  soldier  accidentally  bent  the  front  sight  of  his  rifle  somewhat  to  the  right. 
Until  the  sight  is  repaired,  the  gun  will  shoot  too  far  to  the 

A.  Right  and  too  high. 

B.  Left  and  too  high. 

C.  Left,  but  neither  too  high  nor  too  low. 

D. Right and too low.

E.  Right,  but  neither  too  high  nor  too  low. 

Variable  5,  logical-reasoning  judgment,  items  41-50:  This  variable  is 
intended  to  measure  chiefly  the  ability  to  make  judgments  almost  wholly 
on the basis of logical reasoning. Common sense and mechanical comprehension
are judged to be unimportant. A sample item follows:

An officer is in command of an advance unit in a night foray against distant
enemy  lines  to  test  their  strength  in  preparation  for  an  assault  against  them.  His 
orders  are  to  send  up  a  red  flare  if  the  enemy  is  not  prepared  for  an  attack,  a  blue 
flare  if  the  enemy  is  prepared  for  an  attack,  a  red  flare  and  then  a  blue  flare  if  the 
enemy  is  preparing  an  attack  of  its  own,  and  a  blue  flare  and  then  a  red  flare  if  the 
enemy  is  already  beginning  an  attack.  The  officer  finds  the  enemy  beginning  an 
attack,  but  instead  of  sending  up  first  a  blue  flare  and  then  a  red  one,  by  mistake 
he  sends  up  a  red  flare  first.  If  he  realizes  his  mistake  immediately,  it  would  be  best 
for  him  to 

A. Wait several minutes and then send up a blue flare with a red flare following
immediately after it.

B. Send up a blue flare right away and then a red flare immediately after it.

C. Send up a blue flare, wait a minute, and then send up a blue flare with a
red flare immediately following it.

D. Send up a blue flare, wait a minute, and then send up a red flare with a
blue flare immediately following it.

E.  Send  a  man  back  to  the  base  to  report  the  mistake. 

Variable 6, deductive reasoning, items 51-60: This test is designed
to measure ability to draw a logical conclusion from a problem situation.
Following is a sample item:

An inspector general has an appointment in a city one hundred miles away. If
the  train  on  which  he  must  travel  is  late,  he  will  miss  his  appointment.  If  the  train 
is  not  late,  he  will  miss  the  train.  We  do  not  know  whether  the  train  is  late.  With 
this  information,  we  can  state  positively  that 

A. He will not be able to keep his appointment.

B. He will be able to keep his appointment.

C. There is no way of telling whether he will be able to keep his appointment.

D.  He  will  have  to  take  a  later  train. 

E.  He  will  have  to  wait  for  the  train. 

Variable 7, arithmetic reasoning, items 61-70: This variable measures
arithmetical reasoning ability. Numerical computation is minimized. A
sample item follows:

An Army truck goes 10 miles on a gallon of gasoline and 60 miles on a quart
of oil. If there were 8 gallons of gasoline in the tank and 1¼ gallons of oil in the
motor, how far could this truck go?

A. 70 miles.

B. 80 miles.

C. 90 miles.

D. 170 miles.

E. 440 miles.
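
The reasoning the sample item calls for is that the truck stops when the first supply runs out, so the distance is the smaller of the two ranges. A short Python check of that arithmetic follows, taking the oil quantity as 1¼ gallons and 4 quarts to the gallon. The variable names are illustrative and the source gives no answer key.

```python
QUARTS_PER_GALLON = 4

miles_per_gallon_gas = 10
miles_per_quart_oil = 60
gas_gallons = 8
oil_gallons = 1.25  # oil quantity taken as 1 1/4 gallons (an assumption)

gas_range = miles_per_gallon_gas * gas_gallons                     # 80 miles
oil_range = miles_per_quart_oil * oil_gallons * QUARTS_PER_GALLON  # 300 miles

# The truck can go only as far as the scarcer supply allows.
distance = min(gas_range, oil_range)
print(distance)  # 80
```

Under these assumptions gasoline is the limiting supply. The computation itself is trivial, which agrees with the statement that numerical computation is minimized.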

Variable 8, information in judgment, items 71-83: Thirteen items
were  included  in  this  part  to  test  knowledge  that  would  be  required  to 
answer  the  ten  judgment  items  included  in  variable  2.  A  sample  of  the 
items  in  variable  8  is  the  following  item,  which  was  constructed  to  test 
for  information  considered  crucial  in  answering  item  14  in  variable  2. 

A wooden flagpole 30 feet tall is to be erected in front of a school-house. No
guy wires or other supports can be used. Of the following methods of setting the
pole in place, the one that would be easiest and yet adequately safe would be to

A. Dig a hole 2-feet deep, place the pole upright in it, and replace the soil in
the hole, tamping it solid.

B. Dig a hole 4-feet square and 4-feet deep, mix enough concrete to fill the
hole, place the pole upright in the concrete, and support it in that position
until the concrete hardens.

C. Secure a large block of granite, drill a hole in it large enough for the
pole to fit in, place the block in a hole of appropriate size, and slide the
pole into the hole in the block.

D. Bore a hole about 2-feet deep with an auger slightly larger in diameter
than the pole, insert the pole and fill in around it with sand.

E. Dig a hole 2-feet square and 5-feet deep, place the pole upright in it, and
fill in around the pole with coarse gravel.

Variable 9, mechanical comprehension, items 84-93: This variable includes
10 mechanical comprehension items similar to those in test
CI903A (see page 304 for sample item).

Variable 10, reasoning in reading, items 94-103: This variable consists
of 10 reading comprehension items selected because they appear to
measure a component of reading comprehension called reasoning in
reading (2). Ability to make inferences is stressed. A sample item
follows:

One of the most beautiful military replies I've ever heard of was given in India
by a captain who had lost a steam roller. The Government sent him several forms
to be filled out before it could be replaced. On one form was the question: "Reason
for loss?" The captain filled in the words: "Eaten by white ants." He never heard
another word about it, but in due course of time his replacement arrived.

It  is  most  probable  that  the  captain: 

A.  Did  not  really  know  what  happened  to  his  steam  roller. 

B.  Told  the  truth  about  the  steam  roller. 

C. Was disgusted at having to fill out so many forms.

D. Did not dare tell what had really happened to the steam roller.

E. Did not care whether his steam roller was replaced.

Variable 11, syllogisms, items 104-113: This part is intended to
measure logical reasoning ability unaffected by habitual modes of
thought. The directions are:

Read each one of the following items as you come to it. Then decide whether the
last sentence in each item necessarily follows if the preceding statements in the
item are accepted as true. If you think that the last sentence is a necessary
conclusion, make a mark in the corresponding space on your answer sheet lettered A.
If you think that the last sentence is not a necessary conclusion, make a mark in
the corresponding space on your answer sheet lettered B.

A sample item is: "Only thieves hide jewels. This man hid jewels. Therefore, he
must be a thief."

Variable 12, mechanical movements, items 114-123: This variable includes
10 mechanical movements items adapted (by permission) from
Thurstone's mechanical movements test. The items are similar to the
sample given on page 317.

Variable 13, figure analogies, items 124-136: This part contains 13
figure analogies items taken (by permission) from the nonverbal reasoning
test of the 1942 National Teacher Examinations. They are similar
to the customary items of this type (see page 105).

Variable 14, pattern reasoning, items 137-150: The last variable consists
of 14 pattern-analogies items taken (by permission) from the nonverbal
reasoning test of the 1942 National Teacher Examinations. A
sample item is shown in figure 8.4. The directions are:


FIGURE 8.4

SAMPLE ITEM OF PATTERN-REASONING SECTION OF
JUDGMENT AND REASONING
TEST

Each item of the following section consists of nine diagrams arranged in rows of
three each. The diagrams form a pattern. Some of the nine diagrams are omitted,
and the problem is to determine which of the five figures given as choices belongs
in the ninth space (the third space in the third row).

In assembling each part of the judgment-and-reasoning battery every
effort was made to select items that did not overlap the mental functions
supposed to be tested by other parts of the test. Data that were available
regarding the difficulty of items made it possible to use items having a
sufficiently wide spread of difficulty to make each part approach a miniature
power test. It would have been preferred to use a larger number
of items in each part to provide greater reliability of measurement, but
limitations of time and the desirability of confining the scoring to one
side of a standard answer sheet prevented the use of longer subtests.
To permit virtually every examinee to attempt every item, the present
form of the battery requires 3 hours for administration.
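
The gain in reliability to be expected from longer subtests is commonly estimated with the Spearman-Brown formula, r_kk = k·r / (1 + (k − 1)·r), where r is the reliability of the present subtest and k is the factor by which it is lengthened. The source does not state that this formula was applied to the battery. The Python sketch below is illustrative only, and the reliability of 0.40 for a 10-item part is an invented value.

```python
def spearman_brown(reliability, length_factor):
    """Projected reliability of a test lengthened by `length_factor`."""
    k, r = length_factor, reliability
    return k * r / (1 + (k - 1) * r)


# Hypothetical 10-item part with reliability 0.40, tripled to 30 items.
projected = spearman_brown(0.40, 3)
print(round(projected, 3))  # 0.667
```

The projection assumes that the added items are comparable to the existing ones. Under that assumption a short part gains considerably from lengthening, which is the trade-off against testing time described above.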

The  Samples 

The Judgment and Reasoning Test was administered to a sample of
689 eleventh- and twelfth-grade boys in the Stuyvesant High School,
New York City, and to a sample of 1,024 aviation students classified
for pilot training at Psychological Research Unit No. 2.

Tables 8.9 and 8.10 present the intercorrelations of the 14 part-scores
of the Judgment and Reasoning Test for the high school and aviation-
student samples, respectively. Table 8.11 gives the centroid loadings and
communalities for both analyses. Table 8.12 gives the rotated factor loadings
for both analyses.
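
As a rough illustration of how centroid loadings arise from a table of intercorrelations, the first step of the centroid method can be sketched in Python. Each first-factor loading is a column sum of the correlation matrix divided by the square root of the grand sum, and the residual matrix is what remains after that factor is removed. This is a sketch of the first factor only, not the computing procedure actually followed. The 3 × 3 matrix is invented, and placing the largest correlation of each column in its diagonal cell is one common communality estimate, assumed here.

```python
import math


def first_centroid_factor(corr):
    """Return first-centroid loadings and the residual matrix.

    corr -- square, symmetric list of lists holding correlations, with
            communality estimates (not 1.0) in the diagonal cells.
    """
    n = len(corr)
    column_sums = [sum(corr[i][j] for i in range(n)) for j in range(n)]
    total = sum(column_sums)
    if total <= 0:
        raise ValueError("reflect variables so that the grand sum is positive")
    loadings = [s / math.sqrt(total) for s in column_sums]
    residual = [
        [corr[i][j] - loadings[i] * loadings[j] for j in range(n)]
        for i in range(n)
    ]
    return loadings, residual


# Invented matrix; each diagonal cell holds the largest correlation in its column.
example = [
    [0.6, 0.6, 0.4],
    [0.6, 0.6, 0.5],
    [0.4, 0.5, 0.5],
]
loadings, residual = first_centroid_factor(example)
print([round(a, 2) for a in loadings])  # [0.74, 0.78, 0.65]
```

Each column of the residual matrix sums to zero, so further factors require reflecting the signs of some variables before the same step is repeated. That part is omitted here.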

The  Factors 

In  the  following  paragraphs  the  rotated  factor  loadings  from  the  two 
analyses  will  be  discussed  together,  since  the  six  factors  are  practically 
identical in both. The analysis based on high school students is labeled
I, that on aviation students, II.

Rotated factor I is defined by the following data:

This factor is undoubtedly the reasoning I (general-reasoning) factor
usually best defined by an arithmetic-reasoning test. Some of the discrepancies
in the loadings are due to differences in variability in the two
groups. On most reasoning tests, for example, the aviation-student population
seems to be more variable, as can be seen from a comparison of
the variances of tests loaded with this factor. The difference in loadings
for figure analogies is sufficiently large and in the opposite direction
from that predictable from the variances, however, to suggest a difference
in function tested in the two groups by this test. This test differs
from Figure Analogies, CI212AX1, in that the principles underlying the
analogies are generally more subtle and difficult. In these respects the
test is probably more similar to Figure Classification, CI213AX1, also
known to have a near zero loading on the general-reasoning factor in an
aviation-student sample. It will be noticed that high-school students also
tend to solve mechanical-judgment and commonsense-judgment items by
reasoning.

When  test  items  are  too  difficult  for  the  group  examined  to  handle  in 
the  intended  fashion,  other  abilities  may  be  called  upon  if  motivation  is 
high.  The  easiest  thing  to  do  under  these  circumstances  is  to  seek  a  solu¬ 
tion  in  the  misleads  rather  than  to  seek  the  alternate  which  best  fits  the 
reasoned  solution.  This  may  involve  a  different  sort  of  reasoning  than 
that  called  for  by  an  arithmetic-reasoning  test. 

Rotated factor II is defined by the following data:

14  Pattern Reasoning
13  Figure Analogies

This factor was not well-defined. Additional information is now available,
however, which explains the difficulty in naming it. Tests of figure
analogies have been found with substantial loadings on three factors
other than the reasoning I factor. One of these has been termed integration
III, another reasoning II, and the third, reasoning III (see
pp. 119f.). This factor is probably closest akin to reasoning II, but the
loading of figure analogies is much larger than usual in any factor. It
is possible that there is a combination of two factors here; consequently,
rotated factor II will be named reasoning II only with considerable
hesitation. The Pattern Reasoning Test has never appeared in any other
battery for analysis; nor has this form of Figure Analogies. It may be
that these two forms have by some fortunate circumstance achieved an
unusually high degree of purity (as reasoning tests go) for one of the
reasoning factors.

Rotated factor III is defined by the following data:

This is clearly the mechanical-experience factor. Practically all of the
loadings show a drop from the high-school to the aviation-student sample.
This is explained by the decreased variability of scores of the latter
group on mechanical tests, which can be traced to the mode of their
selection.  It  was  early  recognized  that  the  judgment  test  in  AC10A  had 
a  heavy  mechanical  variance.  The  selection  of  items  by  correlation  with 
the  pilot  criterion — also  heavily  weighted  with  the  mechanical  factor — 
also  favored  this  state  of  affairs  in  later  judgment  tests. 

Rotated factor IV is defined by the following data:

This, the verbal factor, seems to be about equally well-defined in the
two groups. The larger variance of the Vocabulary test in the aviation-
student sample has increased its loading with the factor slightly. Both
analyses show that judgment tests tend to have low but probably significant
loadings in the verbal factor. It would be desirable to depress
this variance still further, particularly if a judgment test is to be used
for pilot selection.

Rotated factor V is defined by the following data:

Mechanical Movements
Mechanical Judgment

The visualization factor is about equally clear-cut in the two analyses.
The Mechanical Judgment test is a poorer measure of the visualization
factor in the aviation-student group, members of which have more mechanical
information. Since it is impossible to state the statistical significance
of a difference in two factor loadings, however, one can only
speculate about a difference of this size.

Rotated factor VI is defined by the following data:

The high-school sample loadings with this factor (which had been
identified first in the analysis based upon aviation students) represent
the nearest approach to those for the aviation-student sample that could
be achieved in the rotations. Without knowledge of the existence of the
factor in the aviation-student group it would have been very easy to
have treated the sixth factor in the high school group as a residual.
Whatever the nature of factor VI, it is clear that any unexplained
validity of variable 3 (Commonsense Judgment) is probably due to its
loading here. Since the communality of this variable is actually somewhat
greater than its estimated reliability, we can conclude that its nonchance
variance is completely accounted for by the common factors in
these analyses.
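
The conclusion rests on the usual partition of a test's variance. The communality is the sum of squared loadings on the common factors (assuming orthogonal factors), the reliability bounds all nonchance variance, and the difference between the two estimates the specific variance. A small Python sketch follows. The loadings and the reliability are hypothetical values, not figures from tables 8.11 or 8.12.

```python
def communality(loadings):
    """Sum of squared loadings on orthogonal common factors (h^2)."""
    return sum(a * a for a in loadings)


def specific_variance(loadings, reliability):
    """Reliable variance not explained by the common factors.

    A zero or negative estimate means the common factors already account
    for all of the nonchance variance; it is floored at zero here.
    """
    return max(0.0, reliability - communality(loadings))


# Hypothetical six-factor loadings and reliability for one short subtest.
loadings = [0.10, 0.05, 0.20, 0.30, 0.00, 0.40]
h2 = communality(loadings)
print(round(h2, 2))                                   # 0.3
print(specific_variance(loadings, reliability=0.28))  # 0.0
```

When the communality exceeds the estimated reliability, as in this invented case, nothing is left over for a specific factor. The excess is then attributable to sampling error in one or both estimates.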

In naming this factor, the fact that it is better defined in the aviation-
student group is of considerable importance. The variances of variable
3, Commonsense Judgment, are almost identical in the two samples, but
there is considerable difference in the mean scores. If factor VI were
an interest factor, it would be relatively easy to account for such a shift.
The interest could be in things aviation or military. The content of the
test which best defined the factor is congruent with this hypothesis. It
would therefore be reasonable to assume that this interest factor would
be stronger in the aviation-student than in the high school group. A second
hypothesis is that the factor represents judgment, an ability traditionally
unrelated to academic work. This ability might also be at a
higher average level in the more mature, less academic aviation-student
group.

It is very reasonable to identify this factor with the one called judgment
in the foresight-and-planning analyses (see ch. 9), which was best
defined by a set of work-planning judgment items. Other tests appearing
with the factor in these analyses were the Practical Estimations Test,
Sequence of Maneuvers, and Competitive Planning. Judgment is certainly
a more plausible designation than interest, for this group of tests.


Validities of the Judgment and Reasoning Tests

Scores on the 14 parts of the Judgment and Reasoning Test were correlated
with graduation-elimination in elementary pilot training. The
validation data are presented in table 8.13, for a sample tested at Psychological
Research Unit No. 2 with the November 1943 classification
battery.

Table 8.13. Validation data for the fourteen parts of the judgment and reasoning
battery based on the graduation or elimination of /46 pilots in primary training

(f,~0.S6)

Part  Type of item

1     Vocabulary
2
3     Commonsense (pure) Judgment
4     Mechanical Judgment
5     Logical Reasoning Judgment
6     Deductive Reasoning
7     Arithmetical Reasoning
8     Information in Judgment
9     Mechanical Comprehension
10    Reasoning in Reading

.22   .32
7.56  7.28  .08  .12
5.94  5.82  .04  .08
6.40  6.36  .01  .14
8.01  7.36  .12  .20
6.10  5.42  .13  .22

* Assuming an unrestricted standard deviation of 1.83,

Conclusions 

Validities  obtained  for  variable  3  (Commonsense  Judgment)  indicate 
that  the  judgment  factor  has  an  appreciable  degree  of  pilot  validity. 
This  test  variable  has  only  one  other  sizeable  loading  in  the  aviation- 
student  analysis,  and  that  is  on  the  verbal  factor,  which  has  no  positive 
pilot  validity. 

Shifts in factor patterns between the high-school population and the
aviation-student population are of at least two sorts. The one has a very
simple explanation — differences in range of talent affect factor loadings
the same as they affect other correlation coefficients. Although there are
a few tests that are exceptions, the verbal and general-reasoning factors
are better defined, and the mechanical-experience factor is more poorly
defined in the aviation-student sample for this reason.

There  is  some  evidence,  on  the  other  hand,  that  certain  tests  measure 
different  abilities  in  the  two  populations.  The  Figure  Analogies  test  shows 
a  shift  from  the  reasoning  I  factor  in  the  high-school  group  to  the  rea¬ 
soning  II  factor  in  the  aviation-student  group.  The  judgment  factor  is 
more  clear-cut  and  is  defined  by  higher  loadings  in  the  aviation-student 
group.  Other  differences  between  the  two  analyses,  unexplained  by  the 
differences  in  variability,  are  not  as  large  and,  on  the  basis  of  present 
data,  probably  cannot  be  distinguished  from  sampling  fluctuations. 

SUMMARY  AND  EVALUATION  OF  JUDGMENT  TESTS 

Judgment was found to be the most frequently mentioned psychological
category to which flight instructors referred when giving reasons for
eliminating pilots. In an attempt to measure and better define this category,
a series of practical judgment tests was constructed. Factor analysis
revealed a judgment factor, best defined by the work-planning type of
item.

In an attempt to understand the informational basis of judgment, a
series of practical-estimation tests was constructed. Items calling for
absolute estimates of time, distance, etc., were found to be uncorrelated
with other practical-estimation and practical-judgment items. Items calling
for relative estimates had satisfactory communality with judgment
items. Of this latter type, those items calling for relatively complicated
estimates involving time as well as distance and size were found to be
significantly loaded with the judgment factor, whereas those items calling
for relatively simple estimates of size and speed contained no judgment
loading. One inference is that the judgment factor is a thinking, rather
than a perceptual or memory ability.

Another attempt to explore the judgment category was based upon the
assumption that the fluency with which hypotheses can be evoked would
be an important element in arriving at the correct solution to a judgment
problem situation. A series of tests of fluency was constructed. Analytical
results for these tests were not available at the time this volume was being
written.


Foresight and Planning Tests *


INTRODUCTION 


One  feature  contributing  to  success  as  a  pilot  is  the  ability  to  plan  a 
series of maneuvers or activities and to foresee and avoid difficulties that
may  arise  in  their  execution.  Such  is  the  statement  of  the  problem  in 
common-sense  terms. 


Job-Analysis  Information  * 

Job-analysis  studies  give  foresight  and  planning  a  high-ranking  posi¬ 
tion  among  pilot  qualifications. 

In  elementary  training. — In  an  analysis  of  the  faculty-board  proceed¬ 
ings*  which  report  why  student  pilots  were  eliminated  from  further 
training  at  the  elementary  stage,  lack  of  foresight  and  of  planning  ability 
was  reported  in  over  a  third  of  the  cases.*  Instructors  stated  that  the  stu¬ 
dents  lacking  this  ability  failed  to  plan  ahead  properly  for  landings,  made 
incorrect  and  dangerous  entrances  and  exits  from  traffic,  were  unable  to 
plan forced landings properly, flew the traffic pattern improperly, etc.

In  another  analysis  of  faculty-board  proceedings,  data  were  fraction¬ 
ated  according  to  the  number  of  flying  hours  and  the  results  in  table  9.1 
were  obtained.  The  frequency  of  deficiency  in  foresight  and  planning 
among eliminees at different stages of primary pilot training is thereby
shown.  From  this  it  can  be  seen  that  the  deficiency  remains  uniformly 
important  after  the  first  five  hours  of  flying  lessons. 

Table 9.1.— Percentage of eliminees from pilot training showing deficiencies in
foresight and planning at different stages in training

Hours of                 Percent of
elementary training      eliminees deficient
1-4                      1
5-8                      4J
9-12                     44
13-16                    44
17-24                    4)
25-34                    4)
35-40                    >4

In another study in which rating scales were filled out by instructors
for 1,303 cadets, lack of foresight and planning was named as a cause of
elimination in 43 percent of the cases. This placed foresight and planning
as the second most frequently indicated deficiency in a group of 20
categories.

* Written by T/Sgt. Sanford J. Stock and the editor.

* In aviation reports foresight and planning are usually regarded as one ability. It remains to
be demonstrated that this is a psychological fact, or even that the two are irreducible categories.

* See table 1.3.

In a study in which instructors rated students after 8 to 10 hours of
flying instruction, biserial coefficients of correlation were computed between
various traits and graduation-elimination from primary training.
The rating on foresight and planning for 369 cadets correlated 0.67 with
graduation-elimination. In the same study, foresight and planning was
mentioned as a deficiency in 38 percent of the eliminees.
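
For readers who wish to reproduce a biserial coefficient of the kind cited here, the following is a minimal sketch in Python. It assumes the textbook formula r_bis = ((M_g - M_e) / SD_t)(pq / y), where p is the proportion graduating, q = 1 - p, and y is the ordinate of the unit normal curve at the point dividing the two proportions. The function name and the numerical inputs are hypothetical illustrations and are not data from the studies described; the original computations may have relied on tables or on a variant of this formula.

```python
from statistics import NormalDist


def biserial_r(mean_grad, mean_elim, sd_total, prop_grad):
    """Textbook biserial correlation between a score and a pass/fail criterion.

    mean_grad, mean_elim: mean scores of graduates and of eliminees
    sd_total: standard deviation of scores in the combined group
    prop_grad: proportion of the combined group that graduated (0 < p < 1)
    """
    if not 0.0 < prop_grad < 1.0:
        raise ValueError("prop_grad must lie strictly between 0 and 1")
    q = 1.0 - prop_grad
    unit_normal = NormalDist()
    # Ordinate of the unit normal curve at the cut separating p from q.
    z_cut = unit_normal.inv_cdf(prop_grad)
    y = unit_normal.pdf(z_cut)
    return (mean_grad - mean_elim) / sd_total * (prop_grad * q / y)


# Hypothetical ratings: graduates average half a standard deviation higher
# than eliminees, and half of the group graduates.
r = biserial_r(mean_grad=5.5, mean_elim=5.0, sd_total=1.0, prop_grad=0.5)
print(round(r, 3))
```

With these invented inputs the coefficient is a little over 0.31. Because the multiplier pq / y shrinks as the split becomes more uneven, the same mean difference yields a smaller coefficient when nearly everyone graduates.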

In  a  study  of  landing  planes,  it  was  reported  that  while  flying  the 
downwind  leg,  the  pilot  must  plan  where  to  make  the  90°  turn  into  the 
base  leg.  Before  the  turn  toward  the  landing  lane,  the  throttle  is  cut  and 
a  glide  established.  The  pilot  must  judge  accurately  where  to  cut  the 
throttle,  how  fast  the  airplane  is  gliding,  and  when  to  make  the  gliding 
turn.  Placing  the  gliding  turn  requires  accurate  judgment  and  planning. 

Eighty-eight  students  in  primary  training  filled  out  forms  indicating 
their  greatest  worries  during  landing.  Placing  of  the  gliding  turn  was 
the  fourth  most  frequently  mentioned  worry.  Eighty-four  students  and 
their  instructors  were  interviewed  individually  to  determine  what  they 
thought  were  the  chief  problems  in  landing.  Placing  the  gliding  turn 
correctly  was  the  third  most  frequently  mentioned  item. 

Later  training  and  combat. — Similar  facts  were  obtained  at  advanced 
stages  of  training.  A  summary  was  made  of  the  frequency  with  which 
various  reasons  were  stated  by  the  faculty  board  for  the  elimination  of 
100  students  in  single-engine  training  and  100  students  in  twin-engine 
training. Forty-seven percent of the eliminees from single-engine training
and 23 percent of the eliminees from twin-engine training were listed as
deficient in foresight and planning.*

Data  are  available  on  the  final  disposition  of  100  unsatisfactory  pilots 
reclassified  by  a  flying  evaluation  board  in  operational  training  or  in 
combat.  Twenty-two  percent  of  the  100  reclassified  pilots  were  listed  as 
deficient  in  the  category  of  intelligence  and  judgment  of  which  foresight 
and  planning  was  regarded  as  a  component.  This  deficiency  was  given 
as  a  reason  for  reclassification. 

Through  the  Informational  Intelligence  Division  of  the  Army  Air 
Forces,  reports  were  obtained  of  interviews  with  American,  British,  and 
Chinese  individuals  and  groups  in  combat,  concerning  efficiency  of  air 
crew, morale, training, operations, etc. The interview material was organized
into psychological and quasi-psychological concepts relevant to air-crew
selection.

The  statements  about  the  fighter  pilot  include: 

Automaticity in combat or while in flight. A good pilot is busy all the time — must
plan ahead * * * You must plan what you are going to do while on the ground
— you must think in advance what you are going to do up in the air * * * foresight
and planning are important for the bomber pilot * * * It is resource,
not daring, that makes a successful operation. The more time spent in preparing
the flight, the better the chances of success.

* See table 1.1.

The Informational Intelligence Division study also reported that a
characteristic of the successful navigator is "planning and foresight, including
being prepared and fully briefed, convinced of what he has to
do and what to do in an emergency."

TESTS  OF  PATHWAY  PLANNING 

It  is  often  necessary  for  the  pilot  or  navigator  to  plan  an  aerial  route, 
subject  to  certain  restrictions.  Finding  the  target  and  returning  to  the 
home  base  are  examples  of  situations  in  which  pathway  planning  is  re¬ 
quired.  While  the  target,  or  base,  might  be  reached  by  several  different 
routes,  the  limitations  of  the  situation  may  actually  permit  only  one 
approach.  The  gasoline  load,  position  of  enemy  antiaircraft,  likelihood 
of meeting enemy fighters, weather conditions — these are types of limitations
which may force the pilot or navigator to select the one, and only
one,  appropriate  path  to  the  objective.  The  need  for  the  ability  to  plan 
routes  prompted  the  development  of  the  Route  Planning  and  Planning  a 
Circuit  tests. 

Route Planning, CI411AX *

Route Planning and Map Planning were constructed as paper-and-pencil
forms of the Foresight and Planning Maze Test, CI405A, an
apparatus test. CI405A consists of a slot-maze board to be used with a
stylus. The parallel straight alleys intersect at acute angles forming diamond-shaped
islands, each with an electric-light bulb in its center. One
of the bulbs is lighted to become the goal of the moment. Various paths
lead to the goal, some being short and economical, and many others are
longer and less direct. The blocked passages are visible to the examinee
who,  on  the  signal,  inserts  the  stylus  at  the  entrance  to  the  maze.  A  light 
appears  on  one  of  the  diamonds  and  remains  lighted  for  15  seconds  dur¬ 
ing  which  the  examinee  plans  his  course  but  does  not  move.  When  the 
light  goes  out,  the  examinee  immediately  starts  for  the  goal  diamond 
and  is  allowed  10  seconds  to  reach  it.  This  cycle  of  events  is  then  re¬ 
peated  with  a  new  starting  point  and  a  new  goal. 

Description. — In Route Planning, CI411AX, the examinee must plan
a path successively from four points on the periphery of a printed maze
to a goal box in its center (see fig. 9.1). There are four item numbers,
one at each corner of the maze. Each number is the starting point for an
item. The darkened square near the center of the maze is the common
goal. The task is to locate the one point through which one must pass in
going from each starting point to the goal. Each group of four items is
based on a pair of identical maze patterns, one that the examinee studies
briefly and one that he uses in making his answers. In the latter, letters
mark the various pathways to the center.

•  Drvtleped  at  JVcketojk.1  Rtw.rck  Unit  N».  S.  CVrf  U.  Wi!*i»«  M.  Wlnfer. 
FIGURE 9.1

SAMPLE STUDY-MAZE OF ROUTE PLANNING, CI411AX

(1)  Internal  characteristics. — The  test  consists  of  2  sample  study 
mazes,  providing  8  recorded  but  unscored  sample  items,  4  to  each  maze, 
and  36  scored  items  based  on  9  different  maze  patterns.  There  are  2 
parts  of  16  and  20  items  respectively.  The  mazes  in  part  I  have  fewer 
lines and are simpler than those in part II. Each maze in the test proper
covers an area 10 inches by 6 1/2 inches. The first sample maze is 2 1/2
inches square and the second sample maze is 4 1/2 inches square.


FIGURE  9.2 

SAMPLE  ANSWER-MAZE  OF  ROUTE  PLANNING, 

CI411AX

(2)  Administration. — The  amount  of  time  allowed  for  studying  each 
diagram  and  for  answering  questions  on  its  mate  varies  as  follows: 


Item No.   Study time   Answer time     Item No.   Study time   Answer time
           (minutes)    (minutes)                  (minutes)    (minutes)
1-4        1.00         0.75            21-24      1.50
5-8        1.25         .75             25-28      1.50
9-12       1.25                         29-32      2.50
13-16      1.25         .75             33-36      2.50
17-20      1.50         .75             37-40      2.50

Following are parts of the directions, and sample mazes are given in
figures 9.1 and 9.2.

This is a test of your ability to plan a route between two points.

Look at the diagram below. (See figure 9.1.) Notice that there are four numbers,
one at each corner of the maze. Each of these numbers is a starting point. The
darkened square near the center of the maze is the common goal. Each number is
connected with the goal by one or more lines. These lines are the routes that you
must follow in going from each starting point to the goal.

Now study the various routes in the maze. Find the point or points through which
you must pass in going from each starting point to the goal.

(After 20 seconds.) Turn the page. Now look at the maze below. (See figure 9.2.)
This maze is identical to the maze you have just studied except that there are
letters on the various routes. Your task is to find the one letter through which you
must pass in going from each starting point to the goal. In going from 91 to the goal,
for example, you may pass through either A or C; however, you must pass through
D; therefore D is the right answer. Now examine the route between 92 and the
goal. Any route you follow takes you through B; therefore B is the right answer
for 92. Similarly, in going from 93 to the goal, you must pass through G; therefore
G is the right answer. In going from 94 to the goal, you may pass through
either I or F; however, you must pass through D; therefore D is the right answer.
(Note that D is the right answer for item 94 as well as for item 91.)

The  test  will  proceed  as  follows.  First,  you  will  be  shown  a  maze  and  told  how 
long  you  will  have  to  study  it.  After  this  study  period,  you  will  be  told  to  turn  the 
page  and  you  will  see  a  second  maze,  identical  to  the  one  you  studied. 
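
The task set by these directions, finding the one lettered point that every route from a starting point to the goal must pass, can be stated as a small graph problem. The sketch below is an illustration only. The maze connections in `MAZE` are hypothetical, chosen to agree with the four sample answers quoted in the directions, since the figure itself is not reproduced here. A point is treated as required if the goal can no longer be reached once that point is closed off.

```python
from collections import deque

# Hypothetical maze graph consistent with the sample directions. The real
# figure is not available, so these connections are assumptions.
MAZE = {
    "91": ["A", "C"],
    "A": ["91", "D"],
    "C": ["91", "D"],
    "94": ["I", "F"],
    "I": ["94", "D"],
    "F": ["94", "D"],
    "D": ["A", "C", "I", "F", "GOAL"],
    "92": ["B"],
    "B": ["92", "GOAL"],
    "93": ["G"],
    "G": ["93", "GOAL"],
    "GOAL": ["D", "B", "G"],
}


def reachable(graph, start, goal, blocked):
    """Breadth-first search from start to goal that never enters `blocked`."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in graph[node]:
            if nxt != blocked and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def mandatory_points(graph, start, goal):
    """Lettered points that every route from start to goal must pass through."""
    letters = [n for n in graph if n not in (start, goal) and n.isalpha()]
    return sorted(p for p in letters if not reachable(graph, start, goal, p))


for item in ("91", "92", "93", "94"):
    print(item, mandatory_points(MAZE, item, "GOAL"))
```

On this assumed maze the search gives D for items 91 and 94, B for item 92, and G for item 93, in agreement with the answers stated in the directions.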

(3) Scoring. — The scoring formula is R - W/2.

Statistical results. — The data given below are for examinees of Psychological
Research Unit No. 3.

(1) Distribution statistics. — Available distribution data are given in
table 9.2.


Table 9.2.— Distribution constants for Route Planning, CI411AX

Group                               N      M       SD
Unclassified aviation students ¹    107    JS.J    4.S
Classified pilots ²                 764    23.9    M

¹ Tested in May 1943.
² In classes 44E, 44F, 44G, and 44H.

(2) Reliability coefficient. — An alternate-forms (part I-part II) reliability
coefficient of 0.77, corrected, was obtained from a sample of 167
unclassified aviation students tested in May 1943. Since the two parts
are not entirely comparable, this is a rough estimate.
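
The text says only that the coefficient was "corrected." A common correction for a correlation between two parts of a test is the Spearman-Brown formula, and the sketch below assumes that this is what was applied; the source does not confirm it. The function names are hypothetical. The second function inverts the formula to show what part I-part II correlation would correspond to the reported 0.77 under that assumption.

```python
def spearman_brown(r_parts, lengthening=2.0):
    """Reliability of a test `lengthening` times as long as each part."""
    return lengthening * r_parts / (1.0 + (lengthening - 1.0) * r_parts)


def part_correlation_for(r_full, lengthening=2.0):
    """Part-test correlation implied by a stepped-up reliability r_full."""
    return r_full / (lengthening - (lengthening - 1.0) * r_full)


implied_r = part_correlation_for(0.77)
print(round(implied_r, 3))                  # correlation between the parts
print(round(spearman_brown(implied_r), 2))  # stepped up to the full test
```

Under this assumption a correlation of about 0.63 between the two parts steps up to the reported 0.77. The formula presumes parts of equal length and difficulty, which the 16-item and 20-item parts are not, and that agrees with the text's caution that the figure is a rough estimate.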

(3) Factorial composition. — The chief loadings are in the planning
(0.47), integration III (0.37), visualization (0.29), and general-reasoning
(0.22) factors. The communality equals 0.63, which is somewhat
short of its reliability (0.77).
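
The relation among loadings, communality, and reliability can be checked with a few lines of arithmetic. The sketch below assumes uncorrelated (orthogonal) factors, so that each factor contributes the square of its loading to the test variance. The variable names are illustrative. Only the four chief loadings quoted above are used, so the remainder of the communality is attributed to smaller loadings not listed in this section.

```python
# Chief loadings quoted above for Route Planning, CI411AX.
chief_loadings = {
    "planning": 0.47,
    "integration III": 0.37,
    "visualization": 0.29,
    "general reasoning": 0.22,
}
reported_communality = 0.63
reported_reliability = 0.77

# With orthogonal factors, each factor contributes its squared loading.
variance_from_chief = sum(value ** 2 for value in chief_loadings.values())
# Common-factor variance left over for loadings not listed here.
variance_from_minor = reported_communality - variance_from_chief
# Reliable variance not shared with any common factor (specific variance).
specific_variance = reported_reliability - reported_communality

print(round(variance_from_chief, 2))
print(round(variance_from_minor, 2))
print(round(specific_variance, 2))
```

The four chief loadings account for about 0.49 of the variance, about 0.14 more comes from the smaller loadings, and a further 0.14 of reliable variance is specific to the test. That last figure is the sense in which the communality falls "somewhat short" of the reliability.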

(4) Test validity. — Validity for pilot training is indicated in table 9.3.

Table 9.3.— Validity data for Route Planning, CI411AX, graduation-elimination
criterion

Group                             N      p_g    M_g     M_e     SD_t   r_bis   r_bis ¹
Pilots in primary training ²      764    0.88   24.01   22.02   7.47   0.07    0.15
Pilots through basic training ³   4SS    .85    24.02   22.97   6.95   .15     .21

¹ Assuming an unrestricted stanine distribution of 2.00.
² In classes 44E, 44F, 44G, and 44H.
³ In class 44F.


Evaluation. — Route Planning, CI411AX, has a validity for pilot success
of approximately 0.16 that is fully accounted for by known factors
as shown in chapter 28 (table 28.18). It has no unique variance to offer
except in the planning factor which has low validity for pilots. It is factorially
complex and its known factors are better measured by other
tests. The directions for Route Planning are relatively complex. It is
therefore not a strong candidate for the air-crew classification battery.

Planning  a  Circuit,  CI401A  * 

Planning a Circuit presents a problem situation in which there is one
and only one appropriate path to the objective, but a pathway that is
obscured by a distracting maze of other pathways. An early form was
developed under the title of Electrical Maze Test, CP401A.

Description. — Each item consists of an electrical-circuit diagram with
many  intersecting  and  intermeshed  wires  with  several  sets  of  terminals. 
The  task  is  to  trace  the  circuits  and  to  determine  at  which  pair  of  termi¬ 
nals  a  battery  should  be  placed  in  order  to  complete  the  circuit  through  a 
meter. 

(1) Internal characteristics. — The test contains 1 unrecorded and unscored
sample problem, 2 recorded but unscored sample problems, and
42 scored items.

(2) Administration. — Fourteen minutes are allowed for completion
of the test. After 12 minutes have elapsed, the administrator warns that
only 2 minutes remain. Following are the directions and sample items.
Figure 9.3 is the sample problem utilized in the directions. Figure 9.4
is an example from the test proper, illustrating one of the more complex
items.

Suppose that each of the following diagrams illustrates the wiring of the dashboard
of an airplane. The small box at the top represents one of the meters on the
panel. In order for the meter to work, a battery must be placed in the circuit at
either A, B, C, D, or E. Only one of these points will successfully complete the
circuit with but one battery. Your task is to find that place where a battery can be
placed so that the meter will work; that is, which will complete a circuit through
the meter.

From the example below, you can see that at one and only one place, such as C,
can a battery be put in so as to complete the circuit successfully. All other choices,
A, B, D, and E are incorrect; either they are connected with another point at which
a battery would have to be placed to make a complete circuit, or both wires from
one of the points, A for example, go to the same pole of the meter. The effect of
this is to short out a battery at that place so that it will not work. Dots in the
figure represent connections; that is, the two wires are joined at that end. Where
there is no such dot, the insulated wires simply cross each other but no connection
is made.

* Developed at Headquarters, Army Air Forces. Chief contributors: Lt. Frank J. Dudek, Col.
John C. Flanagan.


FIGURE 9.3

SAMPLE ITEM OF PLANNING A CIRCUIT, CI401A


FIGURE 9.4

A DIFFICULT ITEM FROM PLANNING A CIRCUIT, CI401A

(3) Scoring. — The scoring formula is R - W/4.
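
A scoring formula of the form R - W/d subtracts a fraction of the wrong answers from the right answers. The sketch below is a minimal illustration with hypothetical function names. It also shows one reading of the divisor used here: with five alternatives (A to E) and a divisor of 4, an examinee who marks every item blindly has an expected score of zero. Whether this was the authors' reason for choosing the divisor is an assumption. The reading does not carry over to the R - W/2 formula of Route Planning and Map Planning, where this section gives no corresponding count of alternatives.

```python
def formula_score(rights, wrongs, divisor):
    """Rights minus wrongs divided by `divisor`; omitted items count nothing."""
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    return rights - wrongs / divisor


def expected_blind_guess_score(n_items, n_choices, divisor):
    """Expected formula score when all n_items are answered purely at random."""
    expected_rights = n_items / n_choices
    expected_wrongs = n_items - expected_rights
    return formula_score(expected_rights, expected_wrongs, divisor)


# Hypothetical examinee on the 42 scored items: 30 right, 8 wrong, 4 omitted.
print(formula_score(rights=30, wrongs=8, divisor=4))

# Blind guessing on 42 five-choice items under R - W/4.
print(round(expected_blind_guess_score(42, 5, 4), 6))
```

The hypothetical examinee scores 28.0, and the expected score for blind guessing is zero to within rounding.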

Statistical results. — (1) Distribution statistics. — Table 9.4 presents
distribution data for two samples.


Table 9.4.— Distribution statistics for Planning a Circuit, CI401A

Unclassified aviation students ¹

¹ Tested in December 1942 at Psychological Research Unit No. 3.

² Sample of 505 unclassified aviation students, 156 pilot eliminees, and 8 unclassified students
eliminated for medical reasons. Test administered with a 20-minute time limit in April 1943 at
Psychological Research Unit No. 1.

(2) Reliability coefficient. — On the mixed sample of 669 cases (see
footnote 2 to table 9.4), an estimated reliability coefficient of 0.96, corrected,
was obtained by the odd-even method. Since this is a speed test,
this figure is a serious overestimation.

(3) Factorial composition. — The leading factors and loadings are perceptual
speed (0.41), planning (0.40), spatial relations (0.28), and
verbal (0.24). The communality is 0.57. For a fuller picture of the factorial
composition of this test, see appendix B.

(4)  Test  validity. — Validity  data  for  pilots  only  are  presented  in 
table  9.5. 

Table 9.5.— Validity data for Planning a Circuit Test, CI401A, and comparable
forms, based upon graduation-elimination of pilots from primary training

Form         SD_t     r_bis ¹
CI401A ²     7.86     0.19
CP401A ³     10.15    .28
AC12I ³      6.77     .16

¹ Assuming an unrestricted stanine standard deviation of 2.00.

² Tested in November and December 1942 at Psychological Research Unit No. 3.

³ These contain the same 45 items. See report No. 6 on the AAF Qualifying Examination.
Groups and testing dates are not identified.

Evaluation. — Planning  a  Circuit  demonstrated  relatively  high  validity 
for  pilot  success  (composite  mean  of  0.26,  based  upon  form  CI401A 
and  comparable  forms),  which  is  exactly  accounted  for  by  its  known  fac¬ 
tors  and  their  loadings  (see  table  28.18).  Its  factorial  complexity,  how¬ 
ever,  makes  it  undesirable  except  where  its  particular  combination  of 
factors  is  desired.  This  combination  seems  to  coincide  well  with  pilot 
requirements.  It  might,  therefore,  be  used  in  a  preliminary  selective 
battery  for  pilots,  such  as  the  AAF  Qualifying  Examination. 

TESTS  OF  ECONOMICAL  PROCEDURES 

In  training  and  in  combat,  complex  situations  continually  occur  in 
which  various  alternative  actions  are  possible.  Several  of  the  alternative 
actions  may  well  lead  to  success.  Success,  in  the  sense  of  reaching  the 
goal,  however,  is  not  sufficient.  For  although  the  goal  may  be  achieved, 
the  act  of  achieving  may  be  too  costly  in  terms  of  effort,  time,  or  mate¬ 
rial.  The  pilot,  bombardier,  or  navigator  must  engage  in  processes  of 
selection— not  only  to  select  correct  actions,  but  to  select  and  execute  the 
action  which  is  most  appropriate  and  most  economical.  He  must  foresee 
the  shortest  route,  the  fastest  method,  the  simplest  procedure.  He  must 
save  time,  effort,  material. 

A  group  of  tests  was  designed  to  measure  this  ability  to  follow  the 
most  economical  procedure  in  situations  where  various  alternative  ac¬ 
tions  are  presented.  These  tests  are  Map  Planning,  Organizational  Plan¬ 
ning,  Planning  Air  Maneuvers,  and  Sequence  of  Maneuvers. 

Map Planning, CI412AX ¹

This is the first and only form of test by this name. It is the second
of two tests designed to parallel the function of the Planning Maze Test,
CI405A, a psychomotor test.

¹ Developed at Psychological Research Unit No. 3. Chief contributor: S/Sgt. Wayne S.
Zimmerman.

Description. — The examinee sees diagrammatic sections from city
maps showing damage to streets following a bombing raid. The streets
are blocked at various points by barriers represented as bomb craters.
The examinee must plan routes for military vehicles to travel through
the damaged areas. The task is to find the shortest passable route as
quickly as possible.

(1)  Internal  characteristics. — Map  Planning,  CI412AX,  contains  four 
recorded  but  unscored  sample  items,  all  appearing  in  one  diamond-maze 
sample  map.  There  are  46  scored  items  in  5  mazes;  6  in  the  first  maze, 
8  in  the  second,  12  in  the  third,  10  in  the  fourth,  and  10  in  the  fifth. 

(2)  Administration. — Each  map  or  maze  is  timed  separately  with 
from  1.5  to  3.0  minutes  per  map  being  allowed.  Total  testing  time  in¬ 
cluding  directions  is  13  minutes. 

Following are the directions and the sample items. The sample map
(fig.  9.5)  included  is  much  reduced  in  size  compared  with  the  mazes 
found  in  the  test  proper. 

This  is  a  test  of  your  ability  to  plan  a  route  between  two  points.  You  will  be 
shown  sections  from  city  maps  showing  damage  to  streets  following  a  bombing 
raid.  Assume  that  you  must  plan  routes  for  military  vehicles  to  travel  through  the 
damaged  area.  Your  task  will  be  to  find  the  shortest  passable  routes  as  quickly  as 
possible. 

Look  at  the  sample  map  below.  Circles  show  places  where  falling  bombs  have 
rendered  streets  impassable.  Note  the  numbers  that  appear  on  the  margin  of  the 
map.  Beginning  with  1  at  the  upper  left,  the  numbers  go  in  a  clockwise  direction 
around  the  edges.  These  numbers  indicate  the  points  between  which  you  must  plan 
routes.  Note,  now,  the  small,  square  buildings  within  the  map  identified  by  letters 
of the alphabet. The shortest route between any two points will take you past one,
and only one, of these lettered buildings. This is illustrated on the map by practice
problem number 1 below:

Find the shortest route from:

1. 1 to 2.

Do that now.

The shortest route takes you by building B, so mark B on your answer sheet
after item number 1. If you passed more than one building on your way, you did
not find the shortest route. In every problem there is just one building on the
shortest route between two numbered points. Work practice problems 2, 3, and 4
below. For items 2, 3, and 4 on your answer sheet, mark the letter corresponding
to the building that you must pass.


FIGURE 9.5

SAMPLE MAP OF MAP PLANNING, CI412AX
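
The problem posed to the examinee, finding the shortest passable route through streets blocked by craters, is the kind of problem that a breadth-first search solves mechanically. The sketch below is an illustration only. The rectangular street grid, the coordinates, and the function name are hypothetical simplifications and do not reproduce the diamond-shaped layout of the test maps.

```python
from collections import deque


def shortest_route_length(rows, cols, blocked, start, goal):
    """Fewest street segments from start to goal on a grid of intersections.

    `blocked` is a set of frozenset({p, q}) pairs naming impassable segments
    between adjacent intersections p and q. Returns None if no route exists.
    """
    dist = {start: 0}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        if (r, c) == goal:
            return dist[(r, c)]
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nxt = (nr, nc)
            inside = 0 <= nr < rows and 0 <= nc < cols
            passable = frozenset({(r, c), nxt}) not in blocked
            if inside and passable and nxt not in dist:
                dist[nxt] = dist[(r, c)] + 1
                queue.append(nxt)
    return None


# Hypothetical 3 x 3 grid of intersections with two cratered segments.
craters = {frozenset({(0, 0), (0, 1)}), frozenset({(1, 0), (1, 1)})}
print(shortest_route_length(3, 3, craters, start=(0, 0), goal=(0, 2)))
print(shortest_route_length(3, 3, set(), start=(0, 0), goal=(0, 2)))
```

On this invented grid the two craters lengthen the shortest route from 2 segments to 6, since the vehicle must detour around the bottom of the map.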

(3) Scoring. — The scoring formula is R - W/2.

Statistical  results. — The  data  for  this  test  are  limited  but  sufficient  to 
permit  an  evaluation  of  its  usefulness.  The  samples  were  tested  at  Psy¬ 
chological  Research  Unit  No.  3. 

(1)  Distribution  statistics. — Distribution  constants  are  given  in 
table  9.6. 


Table 9.6.— Distribution constants for Map Planning, CI412AX

Group                               N      M      SD
Unclassified aviation students ¹    167    26.4   6.9
Classified pilots ²                 864    20.2   6.9

² In classes 44F and 44G.

(2)  Reliability  coefficient. — Correlating  the  scores  on  maps  1,  2  and  4, 
with  the  scores  on  maps  3  and  5,  a  reliability  coefficient  of  0.78,  cor¬ 
rected,  was  obtained  on  a  sample  of  167  unclassified  aviation  students 
tested  in  May  1943. 

(3)  Difficulty. — The  difficulty  level  of  items  in  the  test  is  indicated 
by  the  mean  proportion  of  correct  responses  equal  to  0.87,  based  on  a 
group  of  684  classified  pilots.  Standard  deviation  of  the  difficulty  values 
was  0.09  and  the  range  0.53  to  0.97. 

(4) Factorial composition. — The prominent loadings are in the perceptual-speed
(0.45), general-reasoning (0.31), visualization (0.28),
and  spatial-relations  (0.27)  factors.  The  communality  of  0.57  is  suffi¬ 
ciently  short  of  the  reliability  (0.78)  as  to  indicate  unknown  common 
factors. 

(5) Test validity. — Validation results based on several samples are
given  in  table  9.7. 

Table 9.7.— Validity data for Map Planning, CI412AX, based upon samples of
pilots, with graduation-elimination criterion

Group                    Class   N     p_g    M_g     M_e     SD_t   r_bis   r_bis (corrected)
In primary training      44F     404   0.91   20.42   18.20   ■fVI   0.16    0.2S
Through basic training   44F     412   .89    20.17   18.73   ■Ml    .10     .17
In primary training      44G     460   .89    20.60   17.42   7.11   .25     .30
In primary training      44H     193   .85    25.47   25.65   6.51   -.02    .04
In primary training      44I     254   .82    18.29   1*0*    6.60   .02     .07

(6)  Item  validity. — The  validity  of  responses  in  this  test  is  indicated 
by  a  mean  phi  of  0.08  with  a  range  of  phis  from  —0.11  to  0.28  and  a 
standard deviation of 0.07. The data are based upon responses of 600
graduates and 84 eliminees in classes 44F and 44G.
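
The phi coefficient for a single item can be computed from a fourfold table of item response (right or wrong) by training outcome (graduate or eliminee). The sketch below uses the standard fourfold formula. The function name is hypothetical, and the split of right and wrong answers is invented for illustration; only the group sizes of 600 and 84 come from the text. The original item analysis may have used abacs or a computing shortcut rather than this direct formula.

```python
from math import sqrt


def phi_coefficient(grad_right, grad_wrong, elim_right, elim_wrong):
    """Phi coefficient for a 2 x 2 table of item response by training outcome."""
    a, b, c, d = grad_right, grad_wrong, elim_right, elim_wrong
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / denominator


# Hypothetical item: 90 percent of 600 graduates and about 80 percent of
# 84 eliminees answer correctly. These response counts are invented.
phi = phi_coefficient(grad_right=540, grad_wrong=60, elim_right=67, elim_wrong=17)
print(round(phi, 2))
```

With so uneven a division between graduates and eliminees, even a ten-point difference in percent correct yields a phi of only about 0.11, which helps put the reported mean phi of 0.08 in perspective.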

Evaluation. — This  test  contains  a  number  of  valid  factors  which  ex¬ 
actly  account  for  its  average  pilot  validity  of  0.21.  It  has  no  unique 
variance  to  offer  for  pilot  selection,  but  if  it  were  not  for  its  general 
reasoning  component,  it  might  still  be  used  in  a  pilot-selection  battery 
where  pure  tests  are  not  demanded.  The  combination  of  factors  is  even 
better  for  navigator  selection,  for  which  it  would  probably  validate  to 
the  extent  of  0.30. 

Organizational  Planning,  CI407BX  ' 

This  is  the  second  and  final  form  of  another  test  in  the  economical- 
procedures  subarea. 

Description. — A  schematic  map  of  a  town  with  various  numbered 
buildings  (post  office,  gas  station,  hardware  store,  etc.)  is  presented. 
The  task  is  to  plan  and  organize  the  shortest  possible  route  which  will 
include  a  series  of  stops.  The  examinee  must  foresee  certain  problems 
in  connection  with  the  most  available  and  shortest  routes  and  must  plan 
accordingly. 

(1) Internal characteristics. — The test contains one unrecorded and
unscored sample problem and 42 scored items based on the map of a
town.

(2) Administration. — Five minutes are required for the directions,
and the time limit for the test is 50 minutes.

Following  are  the  directions  and  sample  problem.  The  map  is  shown 
in  figure  9.6. 

This  is  a  test  to  see  how  well  you  can  interpret  a  map.  In  some  of  the  questions 
you  will  be  asked  to  organize  a  trip  to  a  series  of  places.  You  will  have  to  foresee 
certain problems and plan accordingly, selecting the shortest or quickest route.

To  help  you  locate  points  referred  to  in  the  questions,  each  place  is  given  a 
number.  Examine  the  map  and  note  that  these  numbers  are  arranged  in  such  a  way 
that  they  get  larger  from  left  to  right  and  from  top  to  bottom.  Note  the  ferry  (34) 
at  the  lower  right  of  the  map.  The  ferry  is  toll  free  and  runs  every  few  minutes 
except  where  otherwise  indicated. 

Now  work  this  sample  problem: 

You are at the bank (9). You want to stop at the following places before meeting
a friend at the school (38). Which is the first place you will stop?

A. Shoe shop (11).
B. Docks (33).
C. Post office (6).
D. Bike shop (17).
E. Yacht club (23).

For this problem, C is the correct answer. The first stop on the shortest route is
the post office (6).

* Developed at Psychological Research Unit No. 3. Chief contributors: Lt. [first
name illegible] G. Carpenter, Jr., Lt. David H. Jenkins, Sgt. [name illegible],
S/Sgt. Wayne S. Zimmerman.

[Figure 9.6: map for Organizational Planning, CI407BX]

(3)  Scoring. — The  scoring  formula  is  R— W/4. 
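The formula R - W/4 is the familiar rights-minus-a-fraction-of-wrongs correction for five-choice items such as the sample problem above. A minimal sketch follows, assuming that omitted items count as neither right nor wrong. The function name and the example counts are hypothetical.

```python
# A minimal sketch of the rights-minus-wrongs scoring formula.
def formula_score(right: float, wrong: float, choices: int = 5) -> float:
    """Return R - W/(k - 1); for five choices this is R - W/4."""
    if choices < 2:
        raise ValueError("an item needs at least two choices")
    return right - wrong / (choices - 1)

# Hypothetical examinee on the 42-item form: 20 right, 12 wrong, 10 omitted.
print(formula_score(20, 12))

# Expected score when all 42 items are answered by blind guessing:
# 42/5 right and 4*42/5 wrong on average, which the formula brings to zero.
print(formula_score(42 / 5, 42 * 4 / 5))
```

The second call shows the purpose of the penalty: under pure guessing the expected gain from lucky hits is cancelled by the deduction for wrong answers.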

Statistical results. — Data are fairly complete except for validity figures.
The samples were tested at Psychological Research Unit No. 3.

(1) Distribution statistics. — Means and standard deviations are given
in table 9.8.

Table 9.8. — Distribution constants for Organizational Planning, CI407AX ¹ and
CI407BX, based on samples of unclassified aviation students

Form       N        M      SD
CI407AX    200 ²    8.1    3.9
CI407BX    275 ³    16.7   6.1

¹ See [reference illegible] for a description of this variation.
² Tested in December 1942.
³ Tested in May 1942 and August 1943.

(2) Internal consistency. — The internal consistency of items in this
test is indicated by a mean phi of [value illegible] with a range from 0.06
to 0.54 and a standard deviation of [0.14?], based on the highest 25 percent
and the lowest 25 percent of a sample of 200 unclassified aviation students
tested with the CI407AX form in December 1942.

(3)  Reliability  coefficients. — These  may  be  seen  in  table  9.9.  They 
are  somewhat  low,  though  form  B  is  apparently  an  improvement  over 
form  A. 

Table 9.9. — Reliability coefficients for Organizational Planning, CI407AX and
CI407BX, based on samples of unclassified aviation students

Form       Method                  N          r (uncorrected)   r (corrected)
CI407AX    Part I vs. Part II      200 ¹      0.29              0.45
CI407BX    [method not legible]    [223?] ²   .50               .67

¹ Tested in December 1942.
² Tested in May 1942 and August 1943.

Bracketed entries are illegible or uncertain in the source.
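The source does not state how the part-versus-part correlations were corrected. The figures are, however, consistent with the Spearman-Brown step-up for a test of doubled length: 0.29 becomes 0.45, and a half-test correlation of 0.50 becomes the 0.67 quoted for form B in the factorial discussion. The sketch below rests on that assumption, and the function name is illustrative.

```python
# A minimal sketch of the Spearman-Brown formula, assumed (not stated in the
# report) to be the correction applied to the part I vs. part II correlations.
def spearman_brown(r_part: float, length_factor: float = 2.0) -> float:
    """Estimate the reliability of a test length_factor times as long as the part."""
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return length_factor * r_part / (1.0 + (length_factor - 1.0) * r_part)

for r_part in (0.29, 0.50):
    print(r_part, round(spearman_brown(r_part), 2))
```

The same function with a larger length factor gives a rough idea of how much longer a form would have to be to reach a desired reliability, provided the added items behave like the existing ones.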

(4) Difficulty. — The difficulty level of items in form CI407AX is
indicated by the mean proportion of correct responses equal to 0.34,
with a standard deviation of 0.23 and a range of 0.00 to 0.88, based on
the above-mentioned sample of 200 unclassified aviation students.

(5) Factorial composition. — The leading factor loadings of form
CI407BX are in the numerical (0.38), integration II (0.35), integration
III (0.28), and mechanical experience (0.20) factors. The communality
(0.46) is sufficiently short of the estimated reliability (0.67)
for this form to suggest room for other common factors. For a fuller
picture of the factorial composition of this test, see appendix B.

(6) Test validity. — Using a sample of 102 pilots in class 43I tested
on the CI407AX form, a biserial correlation of 0.25, uncorrected, was
obtained against the criterion of graduation-elimination from primary
pilot training. The proportion of graduates was 0.76, the mean score of
the graduates 8.82, the mean score of the eliminees 7.12, and the standard
deviation of the scores of all was 3.96.
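The reported statistics are enough to reproduce the coefficient. The sketch below uses the standard textbook formula for the biserial correlation, in which the difference between group means is divided by the total standard deviation and multiplied by pq/y, y being the normal ordinate at the point that cuts off the proportion p. That the report used exactly this formula is an assumption, and the function name is illustrative.

```python
# A minimal sketch of the biserial correlation from summary statistics.
from statistics import NormalDist

def biserial_r(mean_grad: float, mean_elim: float, sd_all: float, p_grad: float) -> float:
    """Biserial r between test score and a pass-fail criterion."""
    if not 0.0 < p_grad < 1.0 or sd_all <= 0.0:
        raise ValueError("need 0 < p_grad < 1 and sd_all > 0")
    q_elim = 1.0 - p_grad
    z_cut = NormalDist().inv_cdf(p_grad)   # point dividing the unit normal into p and q
    ordinate = NormalDist().pdf(z_cut)     # height of the normal curve at that point
    return (mean_grad - mean_elim) / sd_all * (p_grad * q_elim / ordinate)

# Figures reported above for the class 43I sample tested on CI407AX.
r = biserial_r(mean_grad=8.82, mean_elim=7.12, sd_all=3.96, p_grad=0.76)
print(round(r, 2))
```

With these inputs the formula returns a value that rounds to the 0.25 given in the text, which also confirms the placement of the decimal points in the reported means and standard deviation.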

Evaluation. — Organizational Planning, CI407AX, has an uncorrected
validity biserial of 0.25, based on a small sample. This validity estimate
may be too high, since the N is small and the predicted validity based
upon what is known of the test's factorial composition is only 0.18 (see
table 28.18), but on the other hand the discrepancy may suggest unknown
valid variance. Variance in the so-called planning factor is conspicuously
absent, indicating that the test is misnamed. Its known factors
are better measured by other much more reliable tests.

Variations of the test. — Organizational Planning, CI407AX, contained
one unrecorded and unscored sample problem and 28 scored items divided
into 2 parts of 14 each. All items were based on a map similar to
the one used in the BX form. The time limit for part I was 22 minutes
and for part II, 18 minutes. The pilot validity biserial for CI407AX
(0.25) prompted the development of CI407BX. The number of items
was increased and the map revised to appear more realistic.

Planning Air Maneuvers, CI408AX3 *

This is the third form of the third test in the economical procedures
subarea. It was designed to measure the ability to visualize a course of
action and to plan for its successful completion. The maneuvers in the
test, as is often true of maneuvers in training or combat, must be made
over the shortest, simplest, and most direct path.

Description. — This test assumes that the examinee is a sky-writing
pilot who must plan how to write two adjacent letters by flying the
shortest possible path. The starting and finishing positions of the plane
are shown, and the sharpest turn that the plane can make is indicated.
With this information and the large letters to be written presented in
his test booklet, the examinee must select the correct path and indicate
the direction in which he is traveling at each indicated point.

(1) Internal characteristics. — The test contains [number illegible;
possibly 15] recorded but unscored sample items and 87 scored items.

(2) Administration. — Twenty minutes are allowed for completing
the test. After 10 minutes have elapsed, the examinees are informed that
10 minutes remain.

Following are part of the directions; the sample problem used in the
directions is shown in figures 9.7 and 9.8.

This  is  a  test  of  your  ability  to  plan  air  maneuvers. 

Assume that you are a sky-writing pilot and must plan how to write letter
combinations [...]

FIGURE 9.7

SAMPLE PROBLEM OF PLANNING AIR MANEUVERS, CI408AX3

* Developed at Psychological Research Unit No. 3. Chief contributor: S/Sgt.
Wayne S. Zimmerman.

[The beginning of each remaining line of the directions is illegible in the
source. The legible portions follow, with gaps marked by ellipses.]

... problems that follow, you are to find ... order to do each problem
correctly ... the second. ... plane is in the position labeled ... more
sharply than is shown in the ... diagram and note the sharpest turn. ...
(fig. 9.7). In moving through ... You will indicate which direction ... by
marking either A or B on your ... illustrated. (See fig. 9.8.) Only ... of
the rules set forth above. ... shortest, simplest, and most direct ... first
page. Illustration number 1 is ... before the first. Number 2 is wrong ...
wrong because there is a shorter ... maneuver.

FIGURE 9.8

SOLUTIONS TO SAMPLE PROBLEM OF PLANNING AIR MANEUVERS, CI408AX3

(3) Scoring. — The scoring formula is R - W.

Variations. — Since statistical data will be presented for three forms
of the test, it is desirable to describe the variations here.

Planning Air Maneuvers, CI408AX1, contains 3 recorded but unscored
sample items and 74 scored items divided into 2 parts of 37 each.
Fifteen minutes are allowed for the completion of each part. The items
in the AX1 form range greatly in difficulty because the combinations
varied from one to four letters. As indicated by the statistical results,
this form is too difficult and the items are not highly reliable. An effort
to correct this situation was made in the revision, CI408AX2. This
second form contains 5 recorded but unscored sample items and 114
scored items divided into 2 parts. Twenty minutes are allowed for the
completion of each part. All sky-writing patterns in the AX2 form contain
two letters, in contrast to the varying number (1-4) in AX1.
Furthermore, only the letters easiest to trace, such as D, F, K, P, V, N,
A, R, Z, L, H, and F are used. Other letters from the AX1 form that
proved more difficult were dropped. Statistical data indicate that the
revision achieved the desired effect.

In constructing form CI408AX3, items with the highest internal-consistency
phis were taken from AX2 and the directions clarified. In
form CI408BX1, the same items from AX3 are used. The only difference
is that the directions were deliberately made brief. The purpose of
the revision was to determine the effect of completeness and length of
directions upon the functions measured by the test. No data are available.

Statistical results. — Data are available for the three forms of Planning
Air Maneuvers, for examinees at Psychological Research Unit No. 3.

Table 9.10. — Distribution data for AX1, AX2, and AX3 forms of Planning Air
Maneuvers Test, for groups of classified pilots

Test form   Number of items   N          M         SD
AX1         74                227 ¹      [70.1?]   [11.9?]
AX2         114               147 ²      [19.7?]   [21.3?]
AX3         87                1,141 ³    [11.6?]   [16.?]

¹ Tested in 1942. Class not identified.
² In class 43K.
³ In class 44F.

Bracketed entries are illegible or uncertain in the source.

Table 9.11. — Internal-consistency data for Planning Air Maneuvers Test based on
groups of unclassified aviation students

Test form   N                 M phi     SD phi   Range of phi ⁴
AX1         675 ¹             0.35      0.14     [-0.1?] to 0.47
AX2         673 ²             [.50?]    .11      -.11 to .64
AX3         [illegible] ³     .44       .17      .11 to [illegible]

¹ Tested in December 1942.
² Tested in February and March 1943.
³ Tested in April and May 1943.
⁴ Based on the total group for each item.

Bracketed entries are illegible or uncertain in the source.

(1) Distribution statistics. — See table 9.10.

(2) Internal consistency. — A consistent improvement in test homogeneity
is shown in the successive forms. The data are presented
in table 9.11.

(3) Reliability. — A reliability coefficient of [0.7?], corrected,
was obtained from a sample of [number illegible] unclassified aviation
students tested in March 1943. The coefficient was computed for Part I
vs. Part II of the AX2 form.

(4) Difficulty. — The difficulty level of items in this test is indicated
by the data in table 9.12.

Table 9.12. — Difficulty values corrected for chance success for Planning Air
Maneuvers, based on unclassified aviation students

[The body of table 9.12 is not legible in the source; only its footnotes survive.]

¹ Tested in December 1942.
² Tested in February and [month illegible] 1943.
³ Tested in April and May 1943.

(5) Factorial composition. — The leading factor loadings for the AX3
form are in the planning ([value illegible]), integration III (0.43),
spatial-relations (0.32), and mechanical experience (0.20) factors. The
common-factor variance is probably exhausted, with a communality of 0.69.

(6) Test validity. — Validity data are available for all three forms
and are presented in table 9.13.

Table 9.13. — Validity data for three forms of Planning Air Maneuvers, CI408A,
with the graduation-elimination criterion

Group                      Class           Test   N             Prop.    Mean      Mean          SD         r_bis   r_bis
                                           form                 grad.    grad.     elim.                            corr. ¹
In primary training        [illegible] ²   AX1    227           0.81     [25.05?]  [illegible]   11.94      0.28    ...
In primary training        43K             AX2    147           .87      41.08     30.20         23.17      .25     ...
In primary training        44A             AX3    [illegible]   .82      26.30     23.20         [15.30?]   .11     [illegible]
In primary training        [44F?]          AX3    [illegible]   [.54?]   [32.5?]   29.42         16.44      .09     [.15?]
[Through] basic training   [illegible]     [illegible]  [illegible]  .90  38.87    [16.46?]      16.46      .14     .20

¹ [Partly illegible] ... standard deviation of 2.00.
² Tested in December 1942. Class not identified.

Bracketed entries are illegible or uncertain in the source.

Evaluation. — Planning Air Maneuvers, CI408AX3, is strongly loaded
with a new factor, which has been difficult to define. In three different
analyses its loading has been [0.51?], 0.46, and 0.33, with a mean of 0.46.
A[...] this factor [...] in conjunction with Planning a Circuit and
[...] judgment tests, it is probable that this factor has a small positive
validity for pilots and that it contributes to the average pilot validity
[value illegible] obtained for [...] the Planning Air Maneuvers test. Its
[loading] in integration III is probably a handicap in relation to pilot
validity. Its pilot validity is fully accounted for by known factor
composition, assuming a validity of -0.25 for integration III, which
detracts 0.10 from the total estimate. This calls for steps to rid the test of
its integration III variance.
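The remark that integration III detracts about 0.10 can be illustrated with the usual rule for uncorrelated factors, under which a test's predicted validity is the sum, over factors, of the test's loading times the factor's own validity. The sketch below assumes that rule. The report's prediction procedure is not described in this section, so the function names are illustrative, and only the integration III figures (loading 0.43, assumed validity -0.25) are taken from the text.

```python
# A minimal sketch, assuming uncorrelated factors: predicted validity is the
# sum of (test loading on factor) * (validity of factor).
def factor_contribution(loading: float, factor_validity: float) -> float:
    """Contribution of one factor to the predicted validity of a test."""
    return loading * factor_validity

def predicted_validity(pairs) -> float:
    """Sum the contributions of (loading, factor_validity) pairs."""
    return sum(factor_contribution(a, v) for a, v in pairs)

# Integration III: loading of the AX3 form and the validity assumed in the text.
integration_iii = factor_contribution(loading=0.43, factor_validity=-0.25)
print(round(integration_iii, 4))
```

The product is close to the 0.10 that the text says is lost to this factor. The remaining terms are omitted because the validities of the other factors are not given in this section.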

Sequence of Maneuvers, CI410A 16

This is another test in the economical procedures subarea. It attempts
to duplicate on paper a type of planning that many pilots must execute
in the course of their training and operations.

Description. — The examinee is presented with diagrams of a series
of five maneuvers involving climbs and dives. He must take into account
the altitude at which each maneuver must be done and the amount
of altitude lost or gained. Then he must plan the arrangement for executing
the five maneuvers so as to do the least amount of unnecessary
climbing and diving.

(1) Internal characteristics. — The test contains 2 recorded but unscored
sample items and 32 scored items, divided into 2 parts of 16
each.

(2) Administration. — Twenty-seven minutes are allowed for the
completion of each part. Six minutes are required for directions, bringing
the total testing time to 60 minutes. Following are the directions
and sample items. Purely oral directions are given entirely in italics.

Suppose you have had several hours of solo flying in primary flying school and
your instructor tells you to take up a plane and do several maneuvers. You would
have to figure out in what order you would do the maneuvers, for surely you would
not attempt an [word illegible] loop just after having completed a power dive
which left you quite near the ground, would you? This is a test to measure just
such an ability, to see how well you can plan air maneuvers in flying.

Now look at the cover sheet of your booklet and read the directions silently as
I read them aloud.

This is a test of your ability to plan the most efficient order in which to carry
out a series of practice air maneuvers. For each maneuver in a series, you will be
told the altitude lost or gained while performing it. You will also be told when a
given maneuver should be carried out between two definite altitudes, or when it
must begin or end above or below a particular altitude. For each series of five
maneuvers, you will be told the altitude at which you are to start the series. Your
problem is to figure out the most efficient order in which to perform each set of
maneuvers, i.e., the order that involves the least amount of unnecessary climbing
and diving. You are to note which of the maneuvers comes fourth in your
arrangement, and indicate your answer by marking in the space on your answer
sheet under the letter which corresponds to the maneuver that you decide should
come fourth in the sequence. You may climb or dive before the first maneuver and
between maneuvers, but you should change altitude as little as possible before
and between maneuvers.

Look at example 1. See figure 9.9.

Note that in maneuver A the plane loses 3,000 feet in altitude and must finish
the maneuver at above 2,000 feet. The maneuver can end at any altitude above
2,000 feet so long as it is begun at an altitude 3,000 feet higher. Similarly,
maneuver B involves a loss of 2,000 feet and must be completed at an altitude of
3,000 or above. Maneuver C requires a 500 foot climb, but may be performed at any
altitude. In maneuvers D and E, there is no change in altitude but the entire
maneuver must be done between the altitude levels indicated.

FIGURE 9.9

SAMPLE PROBLEM OF SEQUENCE OF MANEUVERS, CI410A

16 Developed at Psychological Research Unit No. 3. Chief contributor: Lt. [name
illegible] B. Smith.

Now look at me. (Administrator should pause until all heads are lifted.) At this
point I will give you some additional explanation that is not printed in your test
booklet. Look at maneuver D in example 1. The lines at 3,900 and 4,100 feet mean
this maneuver must be done at approximately 4,000 feet. Similarly, the lines
[...] mean that maneuver E must be done at approximately 2,000 feet. Just to the
[...] of B you will find the number 1 in a small circle. (Administrator should pause
briefly to allow cadets to find the encircled number.) This means that maneuver
B is the first maneuver to be done when all the maneuvers are arranged in the
proper order.

Now we will work out the first example together. Notice that at the top of
example one it states that we must start at 5,000 feet. We already know that the
first maneuver should be B. This brings us down from 5,000 feet to 3,000. Which
one should we do next? Look over the four maneuvers that are left. Notice that
A requires the highest altitude, for it also must be begun at an altitude of at least
5,000 feet. Maneuver A involves the greatest loss of altitude of any of the
maneuvers, and will bring us in position to perform maneuver E, at 2,000 feet. We
must therefore get from 3,000 feet, where maneuver B left us, to the 5,000 feet
required to do maneuver A. This involves a climb of 2,000 feet, and we can use
maneuver C for part of this, doing C on the way up. C is thus our second maneuver,
and brings us to 3,500 feet. D must be done at an altitude of about 4,000 feet, so we
can do that on our way up to 5,000 feet without losing any altitude. We will
therefore climb 500 feet more, do maneuver D, and then climb another thousand
feet to 5,000 feet. Now we are in position to do maneuver A, which brings us down
from 5,000 feet to 2,000 feet. This is our fourth maneuver. At 2,000 feet we can
perform maneuver E, our fifth maneuver, without any further change of altitude.
We selected each maneuver with the aim of getting into the best possible position
to perform the rest of the maneuvers.

Now we will continue reading the directions in the test booklet.

The least amount of extra climbing and diving is therefore involved when the
maneuvers are performed in the order B, C, D, A, E. A is the correct answer,
i.e., the fourth maneuver in the proper sequence. The diagrams below show why
this is the best arrangement. See figure 9.10.

FIGURE 9.10

CORRECT AND INCORRECT SOLUTIONS TO SAMPLE
PROBLEM OF SEQUENCE OF MANEUVERS, CI410A

Diagram I shows how the maneuvers may be performed most efficiently, as we
have just done them, while diagram II shows one of the less efficient, incorrect
solutions. The dotted lines represent the maneuvers, which are labeled with the
same letters that they had in the first diagram. The solid lines represent changes
in altitude necessary to get into position to perform maneuvers. When the
maneuvers are done in the correct order shown in diagram I, a 500-foot climb between
maneuvers C and D and a 1,000-foot climb between maneuvers D and A are
necessary. Any other order would require more climbing or diving. For example, when
the maneuvers are done in the order A, E, C, D, B, as in diagram II, a 1,500-
foot climb between maneuvers C and D and a 1,000-foot climb between maneuvers
D and B are necessary. The second arrangement is poor because maneuver A,
which involves the greatest loss of altitude, is performed first instead of B, leaving
the plane in a poor position to perform maneuvers D and B, both of which require
relatively high altitudes. Maneuver A is fourth in the best arrangement, so blacken
the space A after number 1 on your answer sheet.
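The sample problem can be checked by enumerating all 120 orders of the five maneuvers. The sketch below is a simplified model, not the test's own scoring key. It assumes that maneuvers A and B must each begin at 5,000 feet or higher, that D and E are flown at exactly 4,000 and 2,000 feet (the directions say "approximately"), and that the pilot always moves to the nearest permitted entry altitude. The data layout and function name are illustrative.

```python
# A minimal sketch of the sample problem under the simplifying assumptions above.
from itertools import permutations

INF = float("inf")

# name: (lowest entry altitude, highest entry altitude, altitude change), in feet
MANEUVERS = {
    "A": (5000, INF, -3000),  # loses 3,000 ft and must end at 2,000 ft or above
    "B": (5000, INF, -2000),  # loses 2,000 ft and must end at 3,000 ft or above
    "C": (0, INF, 500),       # climbs 500 ft, may be flown at any altitude
    "D": (4000, 4000, 0),     # level, flown at about 4,000 ft
    "E": (2000, 2000, 0),     # level, flown at about 2,000 ft
}

def extra_altitude_change(order, start=5000):
    """Total climbing and diving needed between maneuvers for a given order."""
    altitude, extra = start, 0
    for name in order:
        low, high, change = MANEUVERS[name]
        entry = min(max(altitude, low), high)  # nearest permitted entry altitude
        extra += abs(entry - altitude)
        altitude = entry + change
    return extra

costs = {order: extra_altitude_change(order) for order in permutations("ABCDE")}
best = min(costs.values())
best_orders = sorted("".join(order) for order, cost in costs.items() if cost == best)

print(extra_altitude_change("BCDAE"), extra_altitude_change("AECDB"))
print(best, best_orders)
```

Under this model the order B, C, D, A, E needs 1,500 feet of extra climbing and the order A, E, C, D, B needs 2,500 feet, matching the figures in the directions. Because maneuver C can be fitted in at more than one point, the simplified model may show ties for the minimum; the keyed answer depends only on which maneuver comes fourth.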

(3)  Scoring. — The  scoring  formula  is  R— W/4. 

Statistical  results. — Data  are  limited  for  this  test,  but  are  sufficient  to 
afford  some  evaluation  of  it.  The  samples  upon  which  the  data  are 
based  were  tested  at  Psychological  Research  Unit  No.  3. 

(1) Distribution statistics. — The distribution of scores in this test is
indicated by a mean of 10.3 and a standard deviation of 5.6, based on
a sample of 436 unclassified aviation students tested in December 1942.

(2)  Internal  consistency. — The  internal  consistency  of  items  in  this 
test  is  indicated  by  a  mean  phi  of  0.39,  with  a  range  from  0.10  to  0.57 
and  a  standard  deviation  of  0.11,  based  on  the  highest  27  percent  and 
the  lowest  27  percent  of  220  unclassified  aviation  students. 
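An item phi of this kind compares how often the item is passed in the highest-scoring and lowest-scoring groups. The report does not say how its phis were computed, so the sketch below simply applies the standard fourfold-table formula. The counts are hypothetical, chosen so that each group is about 27 percent of 220 examinees.

```python
# A minimal sketch of the phi coefficient for one item, computed from
# hypothetical pass-fail counts in the upper and lower scoring groups.
from math import sqrt

def phi_coefficient(upper_pass: int, upper_fail: int, lower_pass: int, lower_fail: int) -> float:
    """Phi for a 2 x 2 table of group (upper or lower) by item outcome (pass or fail)."""
    a, b, c, d = upper_pass, upper_fail, lower_pass, lower_fail
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("every row and column of the table must be non-empty")
    return (a * d - b * c) / denominator

# Hypothetical item: 59 examinees in each extreme group.
phi = phi_coefficient(upper_pass=40, upper_fail=19, lower_pass=17, lower_fail=42)
print(round(phi, 2))
```

An item passed equally often in both groups gives a phi of zero, and an item passed more often by the lower group gives a negative phi, which is how negative values can appear in the ranges reported for some tests.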

(3)  Reliability  coefficient. — An  alternate-form  reliability  coefficient  of 
0.66,  corrected,  was  obtained  from  the  above-mentioned  sample  of  436 
unclassified  aviation  students. 

(4) Difficulty. — The difficulty level of items in the test is indicated
by a mean proportion of correct responses equal to 0.32, corrected for
chance success, with a standard deviation of 0.12 and a range from 0.02
to 0.62, based on a sample of 220 unclassified aviation students.
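A difficulty value "corrected for chance success" discounts the correct answers that guessing alone would be expected to produce. The report does not give the formula it used. The sketch below applies the common correction for five-choice items, which mirrors the test's scoring formula at the level of a single item. The function name and the counts are hypothetical.

```python
# A minimal sketch of an item difficulty corrected for chance success.
def corrected_difficulty(right: int, wrong: int, examinees: int, choices: int = 5) -> float:
    """Proportion passing an item after removing expected lucky guesses."""
    if examinees <= 0 or choices < 2:
        raise ValueError("need examinees > 0 and at least two choices")
    return (right - wrong / (choices - 1)) / examinees

# Hypothetical item answered by all 220 examinees: 100 right, 120 wrong.
print(round(corrected_difficulty(100, 120, 220), 3))

# If all 220 had guessed blindly, about 44 would be right and 176 wrong,
# and the corrected difficulty would be zero.
print(corrected_difficulty(44, 176, 220))
```

The uncorrected proportion for the hypothetical item is 100/220, about 0.45, so the correction lowers the apparent easiness of the item noticeably.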

(5) Factorial composition. — The leading factor loadings are in the
verbal (0.39), judgment (0.38), planning (0.35), and numerical (0.30)
factors. Its communality (0.59) indicates that practically all of its
non-error variance (0.66) is known.

(6) Test validity. — For a sample of 247 pilots in primary training,
originally tested in December 1942, the validity coefficient was 0.00. The
proportion of graduates was 0.77, the means of graduates and eliminees
were 9.62 and 9.64, respectively, and the standard deviation of all was
5.30.

Evaluation. — Sequence of Maneuvers, CI410A, is not considered
suitable for administration because of its extremely complicated directions.
Furthermore, the reliability coefficient is relatively low, 0.66, corrected,
which is even more serious in view of the length of time required
to administer the test in its present form. The obtained validity
for pilots in a small sample was 0.00, and from its factor loadings one
would not expect a pilot validity greater than 0.10. It combines two factors
valid for navigators, verbal and numerical, but the validities of its
other two factors for navigator selection are unknown.

A  TEST  OF  PLANNING  BY  DEDUCTION 

It is reasonable to suppose that the victor in aerial combat is usually
the pilot who can anticipate his opponent's moves and then plan his
own maneuvers accordingly. In such planning, the pilot is aware of certain
general factors that govern his action; such factors, for example,
as the position of clouds and the limitations of his own and of his
opponent's airplane. Thus, from observation of the situation, the pilot
must plan by deduction what his opponent will probably do, and then
what he can do to gain advantage. The attempt was made to embody
this deductive aspect of planning in a test called competitive planning.

Competitive Planning, CI409AX2 **

This is the final form of the only test in the subarea planning by
deduction.

Description. — This test is based on the familiar Completion-of-Squares
game, sometimes called "Squares" or "Boxes." In the test, the examinee
must plan moves for both opponents, so that each completes as many
squares as possible in a rectangular diagram of incomplete square figures.
In order to solve the problems correctly, the examinee must anticipate
the best moves for each opponent. The most attractive immediate
move is not always the best move. It was felt that it would be desirable
in at least one foresight-and-planning test to provide an opportunity
for the examinee to refuse immediate gains in favor of later benefits.

** Developed at Psychological Research Unit No. 3. Chief contributors: S/Sgt.
J. Gordon [surname illegible] and [name illegible] Hutchinson.

(1) Internal characteristics. — The test contains 1 unrecorded and unscored
sample problem, 2 recorded but unscored practice problems, and
40 scored items divided into two parts of 20 each. Each diagram is presented
in duplicate so that the examinee may try a second solution without
erasing his first attempts.

(2)  Administration. — Solutions  to  problems  are  worked  out  by 
marking  lines  or  completing  squares  directly  on  the  work  booklet.  These 
solutions  are  then  entered  on  the  standard  five-place  IBM  answer  sheet. 
Seventeen  minutes  are  allowed  for  the  completion  of  each  part. 

Following are parts of the directions contained in the test booklet.
The practice diagrams referred to appear in the work booklet along with
the scored items of the test. They are shown in figure 9.11.

This is a test to see how well you can plan moves in a competitive situation.

Examine the diagrams on page 1 of the work book. Two contestants, Black and
White, took turns filling in the sides of incompleted squares in patterns similar to
those shown on page 1 of the work book. Each of the contestants always made the
best possible moves for himself. Your task will be to reconstruct the moves made by
the two contestants.

The rules were as follows; read them carefully:
a. Black always made the first move, filling in one side of an incompleted
square.


b. Each time Black or White completed a square, he had to make one
additional move. A square is completed when the fourth side is filled in.
c. Each opponent completed the greatest possible number of squares in the
finished pattern.

[Figure 9.11 shows the answer legend (A, Black 0; B, Black 1; C, Black 2;
D, Black 3; E, Black 4; the corresponding White counts are not legible), the
sample problem, and the practice problems.]

FIGURE 9.11

SAMPLE AND PRACTICE PROBLEMS OF COMPETITIVE
PLANNING, CI409AX2

You are to work each problem making only those moves which, at the end of
the problem, give each competitor the highest possible scores despite the best
efforts of the other. Your answer is the number of squares each contestant
completed in the finished pattern. If you fail to select the best possible moves for
both of the opponents, you will not get the correct answers.

You may mark in the work book in order to solve the problem. Each diagram
is given in duplicate so that you may try a second solution without erasing.
Now study the sample problem on page 1 of the work book.

Making the best moves possible, Black and White completed this sample problem
in the following manner: As always, the first move was made by Black. No matter
which side Black filled in, White was able to complete two squares immediately.
After completing his second square, he, White, was compelled, by the rules, to fill
in a side on one of the squares in the other half of the pattern. This enabled Black
to complete the remaining two squares so that the final result became two squares
for Black and two squares for White. Since this result is listed opposite choice
C in the list of alternate answers at the top of the page, the correct answer to
this problem would be C.

(3) Scoring.—The scoring formula is R - W/4.

Statistical results.—The data available are for samples tested in April
and May 1943, at Psychological Research Unit No. 3.
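
The scoring formula R - W/4 can be illustrated with a short sketch. It assumes that R is the number of right answers, W the number of wrong answers, and that the divisor 4 is one less than the five answer choices (A through E) shown in the legend; omitted items are assumed to carry no penalty. The function name and the sample counts are hypothetical.

```python
def formula_score(rights, wrongs, n_choices=5):
    """Rights minus a fraction of wrongs: R - W / (k - 1).

    With five answer choices (an assumption drawn from the A-E legend),
    this reduces to the R - W/4 formula quoted in the text.
    """
    if n_choices < 2:
        raise ValueError("an item needs at least two answer choices")
    return rights - wrongs / (n_choices - 1)


# Hypothetical examinee: 24 right, 8 wrong, remaining items omitted.
print(formula_score(rights=24, wrongs=8))  # 22.0
```

Under this reading an examinee who guesses blindly on five-choice items gains, on the average, nothing from the guesses.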

(1) Distribution statistics.—Means and standard deviations yielded
by two samples are shown in table 9.14.


Table 9.14.—Distribution constants for Competitive Planning, CI409AX2

Group                               N       M       SD
Unclassified aviation students      422     21.4    2.4
Classified pilots ¹                 682     20.1    [illegible]

¹ Classes 44D and 44E.


(2) Internal consistency.—The internal consistency of items in this
test is indicated by a mean phi of 0.38 with a range from -0.17 to 0.80
and a standard deviation of 0.21, based on the highest 27 percent and
the lowest 27 percent of 422 unclassified students.
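
As an illustration of the statistic named above, the following sketch computes a phi coefficient for one item from a fourfold table of upper-group and lower-group examinees. The pass counts are hypothetical; only the group size (27 percent of 422) comes from the text.

```python
import math


def phi_coefficient(upper_pass, upper_fail, lower_pass, lower_fail):
    """Phi for a 2 x 2 table: (ad - bc) / sqrt of the product of the marginals."""
    a, b, c, d = upper_pass, upper_fail, lower_pass, lower_fail
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        raise ValueError("a marginal total is zero; phi is undefined")
    return (a * d - b * c) / denom


group_size = round(0.27 * 422)  # 114 examinees in each extreme group
# Hypothetical item: 100 of the upper group and 60 of the lower group pass.
phi = phi_coefficient(100, group_size - 100, 60, group_size - 60)
print(round(phi, 2))
```

An item passed equally often by both groups would yield a phi of zero, and an item passed more often by the lower group would yield a negative value, as at the low end of the reported range.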

(3) Reliability coefficient.—A reliability coefficient of 0.68, corrected,
was obtained by the part I v. part II method on a sample of 422
unclassified students.
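
The text does not say how the part I v. part II coefficient was corrected. A common choice for such split-test coefficients is the Spearman-Brown formula, and the sketch below assumes that formula purely for illustration; the function names are hypothetical.

```python
def spearman_brown(r_parts, factor=2.0):
    """Estimate the reliability of a test lengthened by `factor`."""
    return factor * r_parts / (1 + (factor - 1) * r_parts)


def parts_from_full(r_full):
    """Invert the doubling formula: the part correlation implied by r_full."""
    return r_full / (2 - r_full)


# If 0.68 is a Spearman-Brown corrected value (an assumption), the
# correlation between part I and part II would have been about 0.52.
r_parts = parts_from_full(0.68)
print(round(r_parts, 2), round(spearman_brown(r_parts), 2))
```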

(4) Difficulty.—The difficulty level of items in the test is indicated
by the mean proportion of correct responses equal to 0.51, corrected
for chance success, with a standard deviation of 0.25 and a range from
[illegible] to 0.96, based upon the above-mentioned sample of 422 cases.
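
One conventional way of correcting a proportion of correct responses for chance success is sketched below. The formula and the assumption of five answer choices per item are illustrative; the source does not state which correction was applied.

```python
def corrected_difficulty(p_raw, n_choices=5):
    """Proportion passing after removing the share expected from guessing.

    Assumes every examinee who does not know the answer guesses at random
    among n_choices alternatives (an illustrative assumption).
    """
    chance = 1 / n_choices
    return (p_raw - chance) / (1 - chance)


# Under these assumptions a raw proportion of about 0.61 corresponds to
# the corrected mean of 0.51 reported in the text.
print(round(corrected_difficulty(0.608), 2))
```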

(5) Factorial composition.—The chief loadings are in these factors:
general reasoning (0.36), judgment (0.36), and integration III (0.33),
with a slight contribution from visualization (0.19). The communality
([illegible]) falls short of the reliability (0.68).
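
For orthogonal factors the communality of a test is the sum of its squared loadings. The sketch below applies that definition to the four loadings named above, assuming orthogonal factors; any smaller loadings not listed in the text are left out, so the result is only a lower bound on the full communality.

```python
# Loadings quoted in the text for Competitive Planning.
loadings = {
    "general reasoning": 0.36,
    "judgment": 0.36,
    "integration III": 0.33,
    "visualization": 0.19,
}

# Sum of squared loadings over the factors listed (a partial communality).
partial_communality = sum(value ** 2 for value in loadings.values())

reliability = 0.68
print(round(partial_communality, 2))                 # about 0.40
print(round(reliability - partial_communality, 2))   # reliable variance not covered
```

The gap between this figure and the reliability is the reliable variance that the four named factors leave unexplained.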




(6) Test validity.—Using a sample of 682 pilots in classes 44D and
44E, a biserial correlation of 0.19, corrected, was obtained with the
criterion of graduation-elimination from primary training. The
proportion of graduates was 0.92, the mean score of the graduates 20.26, the
mean score of eliminees 18.22, and the standard deviation of all scores
was 6.39.
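
The figures in this paragraph are sufficient to work through the standard biserial formula, r = ((Mp - Mq) / SD) * (pq / y), where y is the ordinate of the unit normal curve at the point dividing graduates from eliminees. The sketch below is an illustration under that formula only; it makes no attempt to reproduce whatever correction lies behind the reported 0.19.

```python
from statistics import NormalDist


def biserial_r(mean_pass, mean_fail, sd_total, p_pass):
    """Uncorrected biserial correlation with a dichotomized criterion."""
    q_fail = 1 - p_pass
    unit_normal = NormalDist()
    z_cut = unit_normal.inv_cdf(p_pass)   # point of dichotomy on the normal curve
    ordinate = unit_normal.pdf(z_cut)     # height of the curve at that point
    return (mean_pass - mean_fail) / sd_total * (p_pass * q_fail / ordinate)


# Values quoted in the text for the 682 pilots.
r_bis = biserial_r(mean_pass=20.26, mean_fail=18.22, sd_total=6.39, p_pass=0.92)
print(round(r_bis, 2))
```

With these inputs the uncorrected coefficient comes out near 0.16, somewhat below the corrected value given in the text.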

Evaluation.—Competitive planning has a validity (0.19) for pilots
that is largely unaccounted for by factors of known pilot validity. In
fact, the discrepancy between predicted validity (0.05) and the obtained
validity is so great as to justify search for the unknown valid components.
The test is probably handicapped by its variance in integration III and should
be freed from that element. The general-reasoning factor also contributes
excess variance, which could well be dispensed with so far as pilot
selection is concerned.

Variations.—The CI409AX1 form of competitive planning contained
only 20 items divided into two parts of ten items each. Directions and
problems were consolidated into one booklet. Some items in this earlier
form were considerably more difficult than those in the later revision.
Whereas the X2 items never exceed four squares, the X1 items were
graduated in difficulty from two to nine squares. It was believed that
difficulty was entirely a function of the number of squares. Total
testing time required for this preliminary form was 40 minutes (including
directions). The X1 form had low reliability (part I v. part II
reliability corrected was only 0.28 on a sample of 200 unclassified students
tested in December 1942).

A FACTOR ANALYSIS OF FORESIGHT AND
PLANNING TESTS ¹¹

Analyses were made of two special batteries of foresight and planning
tests ¹² in order to try to understand better their fundamental variances
and to test the hypothesis that there are such fundamental human
abilities as foresight and planning or a single factor underlying the two.

The Data

The two batteries include a small number of planning tests plus a few
tests selected from the classification battery because of their recognized
reference value, plus some experimental tests in the areas of reasoning
and judgment. One basis for the inclusion of judgment tests was that in
certain judgment items which contain problems of a work-planning sort
there seemed to be a unique variable. All printed tests involved in the
analyses are described in this chapter or elsewhere in this volume. The
one psychomotor test, Complex Coordination, is described briefly on
p. 122 and more completely in Report No. 4.

¹¹ Executed by S/Sgt. J. Gordon Elkin, S/Sgt. Benjamin Fruchter, Capt. Lloyd G.
Humphreys, Lt. David H. Jenkins, Sgt. Harold H. Singer, and S/Sgt. Wayne S. Zimmerman
at Psychological Research Unit No. 1.

¹² Hereafter called planning tests, for convenience.
Table 9.15.—Correlation matrix for the foresight and planning battery I (N = 202)
[Table body not reproduced.]

The first correlation matrix (table 9.15) is based upon 202 unclassified
aviation students, and the second matrix (table 9.18) upon 170
classified pilots. In spite of their classification, the range of ability for
these pilots was probably not significantly restricted except perhaps on
the Complex Coordination test, which shows slightly reduced factor
loadings in this sample as compared with the first.

The two sets of centroid loadings and communalities are given in
tables 9.16 and 9.19 and the rotated loadings in tables 9.17 and 9.20.

The  Factors 

Since  most  factors  presented  here  are  found  in  both  analyses,  parallel 
results  will  be  given.  Only  loadings  regarded  as  probably  significant  will 
usually  be  mentioned.  The  criterion  of  significance  is  arbitrarily  taken 
to  be  loadings  above  0.20  in  both  analyses,  or  above  0.25  in  at  least  one 
analysis. 
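
The arbitrary criterion just stated can be written out as a small rule. The sketch below is one reading of it: a loading counts as probably significant if it exceeds 0.20 in both analyses or 0.25 in at least one. Treating a test absent from an analysis as `None`, and comparing absolute values, are assumptions made for the illustration.

```python
def probably_significant(loading_i, loading_ii):
    """Apply the 0.20-in-both or 0.25-in-one rule to a pair of loadings.

    Pass None for an analysis in which the test was not present.
    """
    present = [abs(x) for x in (loading_i, loading_ii) if x is not None]
    if len(present) == 2 and all(x > 0.20 for x in present):
        return True
    return any(x > 0.25 for x in present)


print(probably_significant(0.24, 0.27))   # True
print(probably_significant(0.10, -0.04))  # False
print(probably_significant(0.24, None))   # False
```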


Rotated  factor  I  is  defined  by  the  following  data: 


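[Table not recoverable from this copy. It listed test numbers in analyses I and II, test names, and loadings; the tests named in the discussion that follows include the Pursuit test, Planning a Circuit, and Map Planning.]

* A dash in these tables indicates the fact that this test was not present in this analysis.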


This is clearly the perceptual-speed factor which always comes out
clearly when the first two tests in the list are present in the same
analysis, and the loadings in those two tests are very stable. The
presence here of the Pursuit test and Planning a Circuit test with such strong
loadings is a little surprising and gives reason to modify former
conceptions of this factor. The two tests are clearly similar to Map
Planning in that all of them involve perception of maze-like patterns. Clarity
of visual form may consequently have to be added as an aspect of this
factor.


Rotated  factor  II  is  defined  by  the  following  data: 


Test numbers    Test name                        Loadings
 I     II                                        I       II
 —      6       Numerical Operations             ....    0.63
13      5       Mathematics B                    0.41     .65
 3      —       Organizational Planning           .41    ....
 4      —       Sequence of Maneuvers             .30    ....
14      2       Spatial Orientation I             .24     .27
 —     11       Planning A Course                ....     .28
 —     19       Practical Estimations II         ....     .28
 —      3       Technical Vocabulary (N)         ....     .28


This is the numerical factor. It is interesting to see how this factor
creeps into a variety of tests. Organizational Planning involves numbers
only as symbols of stations in a map. The stations are numbered
systematically, so it is possible that arithmetical computations could have
entered into the solution of the most economical paths. Sequence of
Maneuvers involves frequent arithmetical problems in computation of
altitude changes. Technical Vocabulary (N) probably reflects the
numerical factor indirectly due to its coverage of mathematical interest
and training. It is not so easy to see numerical work involved in the
other tests except in the coding of items and responses.


Rotated  factor  III  is  defined  by  the  following  data: 


Test numbers    Test name                        Loadings
 I     II                                        I            II
13      5       Mathematics B                    0.48         0.50
 5      —       Spatial Reasoning                [illegible]  ....
 6     10       Planning Air Maneuvers           [illegible]  [illegible]
 3      —       Organizational Planning           .29         ....
 —     14       Competitive Planning             ....         [illegible]
12      4       Reading Comprehension            [illegible]  [illegible]
 —     17       Practical Judgment II            ....         [illegible]
 —     13       Map Planning                     ....         [illegible]
 —     12       Route Planning                   ....         [illegible]
 4      —       Sequence of Maneuvers             .21         ....


This  is  a  general-reasoning  factor  consistently  strong  in  Mathematics 
B  (Arithmetic  Reasoning).  It  is  called  general  because  it  is  common  to 
more  tests  than  either  of  two  other  factors  that  are  peculiar  to  reasoning 
tests.  It  can  be  seen  that  most  of  the  planning  tests  have  some  small  but 
probably  significant  loadings  in  this  factor. 


Rotated  factor  IV  is  defined  by  the  following  data : 


Test numbers    Test name                        Loadings
 I     II                                        I       II
16      9       Complex Coordination             0.54    0.40
 5      —       Spatial Reasoning                 .33    ....
 7      —       Planning A Circuit                .28    ....
 6     10       Planning Air Maneuvers            .23     .28
 —     11       Planning A Course                ....     .62
 —     13       Map Planning                     ....    [illegible]


This  is  the  factor  frequently  found  with  stable  loadings  in  the  Com¬ 
plex  Coordination  test  and  is  called  spatial  relations.  It  is  found  with 
greatest  loadings  in  tests  in  which  either  the  stimuli  or  responses  have 
spatial arrangements—right-left, up-down, forward-backward—or both.
Other  tests  strongly  loaded  with  it  are  the  Discrimination  Reaction  Time 
test  and  the  Two-Hand  Coordination  test  (see  Report  No.  4  for  descrip¬ 
tion  of  these  tests).  The  loading  of  0.62  in  Planning  a  Course  is  probably 
spuriously  high  since  in  another  analysis  the  same  loading  is  only  0.34 
(see  p.  224).  Discrepancies  as  large  as  this  are  rare  in  factorial  results. 
It  can  possibly  be  attributed  to  sampling  errors. 

Rotated  factor  V  is  defined  by  the  following  data.  Nonsignificant 
loadings  are  reported  for  this  factor  in  planning  tests  because  this  factor 
is  of  special  interest  in  that  connection. 
Test numbers    Test name                        Loadings
 I     II                                        I       II
11      8       Mechanical Principles            0.50    0.45
 9      —       Driving Skill                     .42    ....
 2      —       Nonmechanical Judgment            .30    ....
16      9       Complex Coordination              .28     .09
 —     14       Competitive Planning             ....     .32
 —     17       Practical Judgment II            ....     .29
 —     13       Map Planning                     ....     .28
 —     12       Route Planning                   ....     .24
 —     11       Planning A Course                ....     .1[illegible]
 6     10       Planning Air Maneuvers            .10    -.04
 7      —       Planning A Circuit                .03    ....
 3      —       Organizational Planning           .00    ....
 4      —       Sequence of Maneuvers             .00    ....


This  is  the  visualization  factor  which  apparently  entails  the  manipu¬ 
lation  of  visual  symbols.  One  might  expect  planning  of  various  kinds 
to  depend  heavily  upon  some  type  of  visualization,  but  except  for  small 
loadings  in  Competitive  Planning  and  three  other  planning  tests,  this 
seems  not  to  be  true.  This  may  be  taken  to  mean  that  this  factor  merely 
involves a very simple transformation of some perceived or imagined
pattern.  It  apparently  does  not  serve  in  creative  thinking  but  does  seem 
to  help  arrive  at  facts. 


Rotated  factor  VI  is  defined  by  the  following  data : 


Test numbers    Test name                        Loadings
 I     II                                        I       II
12      4       Reading Comprehension            0.65    0.69
 4      —       Sequence of Maneuvers             .39    ....
 5      —       Spatial Reasoning                 .34    ....
13      5       Mathematics B                     .32     .27
 6     10       Planning Air Maneuvers            .26     .26
11      8       Mechanical Principles             .20     .26
 —     16       Practical Judgment I             ....     .30


This is the very well known verbal factor. Of all planning tests it is
found to a moderate degree only in the Sequence of Maneuvers test.
This test is distinct among planning tests for its unusually long and
involved verbal instructions. A similar explanation cannot well be given
for the loading of 0.26 in Planning Air Maneuvers, however, for its
instructions are fairly simple and straightforward. Verbal comprehension
must, therefore, enter into these two tests in some other manner, or the
loading in Planning Air Maneuvers is perhaps spurious.


Rotated factor VII is defined by the following data:


Test numbers    Test name                        Loadings
 I     II                                        I       II
10      7       Mechanical Information           0.77    0.74
11      8       Mechanical Principles             .61     .64
 1      —       Mechanical Judgment               .54    ....
 9      —       Driving Skill                     .46    ....
12      4       Reading Comprehension             .39     .49
16      9       Complex Coordination              .26     .30
 6     10       Planning Air Maneuvers            .17     .31
 —     17       Practical Judgment II            ....     .32
 —     18       Practical Estimations I          ....     .33
 —     19       Practical Estimations II         ....     .32


The mechanical-experience factor is here shown to be an element in
only one planning test, Planning Air Maneuvers, and even in that it is
rather weak. In all others, its variance is zero or insignificant.

Rotated factor VIII is defined by the following data:


Test numbers    Test name                        Loadings
 I     II                                        I            II
 2      —       Nonmechanical Judgment           [illegible]  ....
 4      —       Sequence of Maneuvers            [illegible]  ....
 1      —       Mechanical Judgment               .36         ....
 —     17       Practical Judgment II            ....         [illegible]
 —     14       Competitive Planning             ....          .36
 —     18       Practical Estimations I          ....          .36
 —     16       Practical Judgment I             ....         [illegible]



With the four judgment tests all having equivalent loadings on this
factor, although no test is in both batteries, the identity of the factors
in the two analyses could hardly be mistaken. As a matter of fact, many
items in the mechanical and nonmechanical judgment tests in the first
analysis were identical with items in Judgment I and II in the second
analysis. The best hypothesis for this factor is judgment: the ability
to weigh solutions and select the wisest and best one. This interpretation
fits Sequence of Maneuvers and Competitive Planning very well. Why
Practical Estimations II is not present in the list is unexplainable. This
factor was also found in the analysis of judgment and reasoning tests
(see p. 152).


Rotated  factor  IX  is  defined  by  the  following  data: 


[Table not recoverable from this copy: the test names are lost. The legible loadings range from .30 to 0.51.]


The only common ties for these two lists of loadings are the ones for
Planning Air Maneuvers and for the two judgment tests. To call these
separate factors would inflate the over-all communality of Planning Air
Maneuvers to an untenable level. Its communality is 0.71 in the second
analysis, which comes very close to its estimated reliability¹ (0.73).

This factor cannot be satisfactorily interpreted at present. One
hypothesis might be that it represents an ideational fluency; the man who can
think of more solutions per unit of time would have an advantage in
some of these tests. Planning a Circuit does not fit this hypothesis very
well, however, nor do some of the estimation tests. Another hypothesis
might be that this is some form of visualization different from the
manipulation type (factor V). This idea fits most of the tests but lacks
fully convincing evidence. Why other planning tests do not also require
the same type of visualization is hard to understand. Only further data
will more clearly define this factor. It had best be left with the general
name of planning until more definitive evidence is available.

Rotated factor X appears only in the second analysis:


Test numbers    Test name                        Loadings
 I     II                                        I       II
 —     10       Planning Air Maneuvers           ....    0.44
 —     12       Route Planning                   ....     .39
 —     14       Competitive Planning             ....     .33
 —     11       Planning A Course                ....     .33


This  seems  to  be  identifiable  with  the  factor  called  integration  III, 
to  be  described  in  chapter  10. 

It  is  curious  that  the  only  factor  that  has  so  many  of  the  planning 
tests  in  common,  and  only  planning  tests  to  any  significant  degree,  should 
not  be  called  "planning,”  that  name  being  given  to  another  factor.  In  the 
integration-battery  analysis  at  least,  five  nonplanning  tests  have  appar¬ 
ently  significant  loadings  on  the  same  factor.  It  might,  after  all,  be  a 
second  planning  factor.  The  term  "integration  III”  merely  arises  from 
the  fact  that  it  was  discovered  in  the  integration  battery. 

General  Conclusions 

In  conclusion  it  can  be  said  of  planning  tests  that  their  fundamental 
variances  break  down  along  different  lines,  some  assignable  to  already 
well-known  factors,  and  some  to  new  unidentified  factors.  No  planning 
test  in  the  list  was  found  to  be  satisfactorily  pure.  Most  of  them  have 
significant,  though  rather  small,  loadings  in  general  reasoning.  Given 
in  order  of  their  loadings  in  this  factor  they  are:  Map  Planning,  Plan¬ 
ning  Air  Maneuvers,  Competitive  Planning,  Organizational  Planning, 
Route  Planning,  and  Sequence  of  Maneuvers.  None  of  them  can  be 
recommended,  however,  as  a  general  reasoning  test.  All  except  Organiza¬ 
tional  Planning  and  Map  Planning  have  probably  significant  loadings  in 
the  unidentified  factor,  which  may  be  ideational  fluency  or  a  creative 
visualization  rather  than  planning  as  such.  The  strongest  tests  in  this 
factor  are,  in  order:  Planning  Air  Maneuvers,  Planning  A  Circuit,  and 
Sequence  of  Maneuvers.  Map  Planning  and  Planning  A  Circuit  have 
strong  variances  in  perceptual  speed.  Organizational  Planning,  Sequence 
of  Maneuvers,  and  Planning  A  Course  have  moderate  to  low  loadings  in 
numerical  facility.  As  a  negative  conclusion,  it  can  be  said  that  planning 
tests  are  not  mechanical,  not  visualization  tests  (of  the  manipulatory 
variety),  nor  are  they  verbal  (except  for  Sequence  of  Maneuvers). 

When factor loadings are considered in connection with arbitrary
groupings of this chapter, it will be seen that there is not much
supporting evidence for that type of categorization. Two of the three
economical-procedures tests are leaders with variance in the new planning
factor. The third, however, is decidedly missing from the list and a test not
in the list, Planning A Circuit, is prominent. We cannot, therefore, call
the new variable an economical-procedures factor.

All in all, these analyses have failed to demonstrate a clear-cut
fundamental ability that should be called either foresight or planning, and while
two new interesting factors have been uncovered, no relatively pure test
for either of them has as yet been found.

CHAPTER TEN


Integration Tests¹


INTRODUCTION

Job-Analysis Findings

Inability to pay attention to numerous conditions while engaged in
some phase of flying activity and to construct an integrated impression
of these conditions quickly and appropriately seemed to be the common
element of a variety of stated reasons for eliminations in primary pilot
schools. "Unable to think of more than one thing at a time," "frequently
becomes confused," "suffers from indecision," "cannot divide attention"
— these are typical comments made by instructors regarding failing
students who are probably deficient in the ability to integrate.

In a faculty-board account of reasons for elimination of 102
bombardier students, the lack of "ability to execute a series of activities
accurately and in proper order" was mentioned as a deficiency in 70 percent
of the cases by instructors and check-flight bombardiers.*

Analysis of the performance required of the pilot trainee in primary
school, or of the navigator or bombardier in advanced school, suggests
that the successful air-crew member must maintain sets to respond to a
large number of conditions, cues, and reference points. Often these
conditions must be observed simultaneously or at least within a brief period
of time. Moreover, they must often be noted while some other activity is
being carried out, thus making it necessary to divide attention without
disrupting the pattern of action in progress. Some cues, when they
occur, call for immediate action; others call for delayed responses with
which there must be no interference by intervening activities. In order
to respond appropriately, the various conditions that influence action at
a given time must be observed, remembered, and integrated.

An illustration of these requirements is seen in the pilot's choice of
fields during forced-landing practice. The pilot must make most of his
observations while establishing and maintaining the proper glide. Certain
conditions that will determine the field chosen must be noted quickly
while other conditions must be remembered from previous observations.
Among the numerous things requiring consideration are (1) direction
of the wind; (2) altitude of the plane; (3) relative distances from
available fields; (4) surface characteristics of available fields; and (5)
hazards to the approach of available fields. Students able to make an
appropriate judgment in relation to any one of the above conditions
presented singly may find it very difficult to make a successful integrated
response in the presence of a number of them.

Requirements  of  an  Integration  Test 

In constructing a test for this supposed function, the following
considerations were observed:

a. The difficulty of learning any material should be kept at a
minimum. If possible, the test score should be unaffected by differences in
learning ability.

b. The signs to which the examinee responds should be presented (1)
preferably during the conduct of some activity and (2) in such a manner
that a number of them must be carried in mind simultaneously.

c. The multiple cues should not be such that they singly lead to
separate actions. Rather, they should require integration and the selection of
an appropriate response or series of responses governed jointly by the
several cues.

In  order  to  fulfill  these  requirements,  the  test,  in  addition,  should  be 
built  around  some  common  pattern  of  activity  that  would  be  modified  in 
various  ways  by  the  test  conditions. 

An  Hypothesis  Based  on  Factor  Analysis 

The technique of factor analysis contributed another reason for the
development of integration tests. The hypothesis was advanced that the
"Mashburn* factor" or "intellectual component of Complex
Coordination" (later identified as spatial relations), which has been a constant
component of the Complex Coordination test in all analyses, involved
the ability to integrate a number of disparate activities quickly and
accurately. In the light of this hypothesis, and for the reasons enumerated
previously, a battery of tests was constructed which, it was hoped, would
measure the ability to integrate. Tests designated as integration tests
were: (1) Planning a Course; (2) Flight Formations; (3) Signal
Interpretation; (4) Forced Landings; (5) Combat Planes; (6) Complex
Concentration; and (7) Code Analysis.


THE  INTEGRATION  TESTS 


Planning  a  Course,  CI406AX3  * 

This  is  the  final  experimental  form  of  a  test  in  the  area  of  integra¬ 
tion.  The  ability  to  plan  a  course  of  action,  considering  various  factors 
and  exhibiting  proper  division  of  attention,  is  believed  to  represent  one 
type  of  integration. 

Description.—The examinee learns a simple set of signals to which
appropriate responses must be made. He finds these signals scattered
through a standard rectangular maze, and he encounters them as he moves
through the maze by drawing a line from the beginning to the end of it.

(1) Internal characteristics.—This test consists of 15 sample items
and 120 scored items. The problems are presented diagrammatically, in
series of five items.

(2) Administration.—The examinee receives a direction sheet and a
work booklet. Answers are marked directly on the work booklet. When
the test is finished, answers are transcribed to the regular IBM answer
sheet. The time limit for the test is 15 minutes, exclusive of transcription
time. In figure 10.1 may be seen the first practice diagram. A part of the
directions for the test follows. Administrative directions that are read by
the administrator to supplement the directions sheet are printed in italics.

This is a test of your ability to modify a planned course of action. Look at
practice diagram 1. Your task will be to determine the correct course through similar
diagrams. Notice the vertical and horizontal pathways and the entrance marked
start. Also observe the letters, R, L, DS, and CD, which are written at the
starting point and above the pathways at various places in the diagram. These
letters signal the directions to be taken when the course is traced through the
pathways.

These signals and their meanings are as follows:

R = One move to the right.

2R = Two moves to the right.

3R = Three moves to the right.

L = One move to the left.

2L = Two moves to the left.

3L = Three moves to the left.

[Figure 10.1 diagram not reproduced: a rectangular maze entered at START, with signals printed above the pathways and rows numbered 201 through 205.]

FIGURE 10.1

SAMPLE DIAGRAM OF PLANNING A COURSE,
CI406AX3

DS = Double signals. That is, carry out twice all signals which are passed after
the "DS" signal.

CD = Cancel double signal. This signal removes the effect of the DS.

Beginning at the word start, your task will be to follow the signals in the
diagram, tracing a course through the pathways until the bottom line of the
diagram is reached. Sometimes you will be moving to the left and sometimes to the
right. Before changing the direction of the course from left to right, or from right
to left, always make one move down. Never turn back on your course, go down
instead.

Now  look  again  a!  practice  diagram  l .  Use  your  pencil  ond  trace  the  course  in 
the  diagram  as  the  administrator  reads  the  explanation.  .4s  you  go  through  prac¬ 
tice  diagram  I,  other  rules  will  be  brought  out.  These  rules  null  be  summarised 
when  you  are  through  with  the  first  diagram. 

"The  course  begins  at  the  word  start.  The  first  move  to  be  made  is  indicated 
by  the  signal  which  appears  under  the  word  start  In  this  problem  the  first  signal 
is  L.  Therefore,  the  first  move  is  one  square  to  the  left  from  the  starling  point. 
Make  the  move  now."  (Pause.]  'This  move  passes  under  the  signal  R,  which 
means  "make  one  move  to  the  right."  Because  this  signal  indicates  a  change  in  the 
direction  of  the  course,  one  move  down  must  be  made  before  moving  to  the  right 
This  move  down  passes  through  column  B.  Make  the  down  move  through  column 
B  now."  (Pause.]  "Now  move  one  square  to  the  right  to  column  C" 

"The  move  to  the  right  passes  under  the  signal  ZR,  which  means  'make  two  more 
moves  to  the  right*  Make  those  moves  now."  ]  Pause.]  "This  carries  the  course 
to  column  E.  These  last  two  moves  passed  under  the  signals  DS  and  L.  As  this 
DS  doubles  the  L  signal  which  follows  it  tlie  next  move  must  be  twice  L,  or  two 
moves  to  the  left" 

“However,  since  this  is  a  change  in  direction,  a  move  down  must  be  made  through 
the  letter  E  before  the  two  moves  to  the  left  are  made.  Now  make  the  move  down 
through  E,  opposite  201.  Now  move  two  squares  to  the  left  This  takes  you  to 
column  G"  (Pause.]  "These  last  moves  passed  under  CD  and  2L,  which  means 
‘Cancel  double  signal,'  and  ‘Make  two  moves  to  the  left*  " 

"Make these two moves over to column A now." (Pause.) "These moves to the
left passed below 3R which, since it calls for a change in direction, will be made
after a down move has been made through the letter A, opposite 202. Make the down
move through A now." (Pause.) "Now move the three squares to the right to
column D." (Pause.)

"In making these three moves to the right, the course passed under a DS, an
R, and an L signal. However, since the execution of the 3R signal carried the
course to column D, there is not enough space left in this row to move the two
squares called for by the doubled R signal; that is, the DS followed by the R.
Therefore, the doubled R must be postponed until it can be carried out, and a move
down must immediately be made through column D instead. Make the down move
through D now, opposite 203." (Pause.)

"The signal L, which was also doubled because it followed DS, also remains
to be carried out. Do this now by moving two squares left to column B." (Pause.)
"Since there are no more left moves to be made, a down move must be made
through B. Make this move through B now, opposite 204." (Pause.)

"The doubled R which has not yet been carried out can now be executed. Make
the two moves right to column D now, opposite 205." (Pause.) "Now move down
through D since there are no more moves to the right to be made." (Pause.)
"Observe that you have reached the bottom of the diagram without being able to
carry out the doubled L." (Pause.)

"Note that the course passes through a letter whenever a down move is made
opposite one of the numbers on the left. This letter indicates your answer. Do not
mark any answer sheet at this time.

Answers will not be recorded until the entire test is completed. The correct
answers to practice diagram I are: 201—E; 202—A; 203—D; 204—B; and 205—D."

At this point the directions are summarized again, and the examinees
work sample problems 2 and 3. Then they are allowed 15 minutes to
complete the 120 scored items.

(3) Scoring.—The scoring formula is R − W/4.
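The formula R − W/4 has the form of the usual correction for guessing, in which each wrong answer costs 1/(k − 1) of a point for k answer choices. The following is a minimal sketch, assuming k = 5 for this test (answers are the letters A to E) and assuming omitted items count neither as right nor as wrong. The function name and the examinee figures are illustrative and do not come from the report.

```python
def corrected_score(rights: int, wrongs: int, choices: int = 5) -> float:
    """Rights minus wrongs/(choices - 1); omitted items are ignored."""
    if choices < 2:
        raise ValueError("an item needs at least two choices")
    return rights - wrongs / (choices - 1)


# Hypothetical examinee: 80 right, 20 wrong, 20 omitted out of 120 items.
print(corrected_score(80, 20))  # 75.0
```

With `choices=3` the same function gives the R − W/2 form used for other tests in this chapter.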

Statistical results.—The data given below are for classified pilots
tested at Psychological Research Unit No. 3 in August 1943.

(1) Distribution statistics.—The distribution of scores in this test is
described by a mean score of 71.7 and a standard deviation of 24.3, based
on a sample of 877 classified pilots.

(2) Internal consistency.—The internal consistency of items in this
test is indicated by a mean phi of 0.38, with a range from 0.07 to 0.83
and a standard deviation of 0.14, based on the highest 27 percent and the
lowest 27 percent of 800 classified pilots.
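The report does not spell out how each item phi was computed. A common approach with upper and lower 27 percent groups is the fourfold-table phi between group membership and passing the item, and the sketch below assumes that reading. The counts are made up for illustration (27 percent of 800 is 216 examinees per group).

```python
from math import sqrt


def phi_coefficient(upper_pass: int, upper_fail: int,
                    lower_pass: int, lower_fail: int) -> float:
    """Fourfold-table phi between criterion group and item success."""
    a, b, c, d = upper_pass, upper_fail, lower_pass, lower_fail
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        raise ValueError("a margin of the table is empty")
    return (a * d - b * c) / denom


# Hypothetical item: 180 of 216 high scorers and 100 of 216 low scorers pass it.
phi = phi_coefficient(180, 36, 100, 116)
print(round(phi, 2))
```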

(3) Reliability coefficient.—A reliability coefficient of 0.81, corrected,
was obtained by the alternate-forms method on a sample of 167 classified
pilots. This was computed on a preliminary form of the test, which
had two parts separately timed. Although the final form of the test was
not divided into two parts, the items are sufficiently similar to the
previous form so that this reliability coefficient can be considered
representative.

(4) Difficulty.—The difficulty level of items in this test is indicated
by the mean proportion of correct responses equal to 0.72, corrected for
chance success. The proportions ranged from 0.16 to 0.99 with a standard
deviation of 0.20. These data were based upon results from 167
classified pilots.

(5) Factorial composition.—The chief factors are spatial relations
(0.45), integration III (0.41), numerical (0.30), and general reasoning
(0.24). The communality is 0.64, which is well short of the
reliability (0.81).
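The comparison of communality with reliability can be made concrete. With uncorrelated factors, communality is the sum of squared loadings, and reliability minus communality is the reliable variance that the common factors leave unexplained. The sketch below assumes uncorrelated factors and uses only the four loadings quoted above, so its sum falls short of the reported 0.64, which presumably also includes smaller loadings not listed here.

```python
loadings = {
    "spatial relations": 0.45,
    "integration III": 0.41,
    "numerical": 0.30,
    "general reasoning": 0.24,
}
reported_communality = 0.64
reliability = 0.81

# Variance accounted for by the four chief factors (sum of squared loadings).
chief_variance = sum(value ** 2 for value in loadings.values())

# Reliable variance not accounted for by the common factors.
specific_variance = reliability - reported_communality

print(f"chief factors: {chief_variance:.2f}")         # chief factors: 0.52
print(f"specific variance: {specific_variance:.2f}")  # specific variance: 0.17
```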

(6) Test validity.—The validity of this test is indicated by a biserial
correlation with the graduation-elimination criterion of 0.17, uncorrected.
This statistic is based on 877 classified pilots in class 44C. The
proportion of graduates was 0.91, the mean score of the graduates was
72.46, the mean score of the eliminees was 64.30, and the standard
deviation for all scores was 24.32.
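The uncorrected biserial coefficient can be checked from the figures in the paragraph above. The sketch below uses the standard biserial formula, (mean of graduates − mean of eliminees) / SD × pq / y, where y is the ordinate of the unit normal curve at the point that cuts off the proportion p. It takes the standard deviation as 24.32, in line with the 24.3 given under distribution statistics. The function name is illustrative only.

```python
from statistics import NormalDist


def biserial_r(mean_pass: float, mean_fail: float,
               sd_total: float, p_pass: float) -> float:
    """Biserial correlation of a continuous score with a pass/fail criterion."""
    q_fail = 1.0 - p_pass
    # Ordinate of the unit normal curve at the point cutting off p_pass.
    z_cut = NormalDist().inv_cdf(p_pass)
    ordinate = NormalDist().pdf(z_cut)
    return (mean_pass - mean_fail) / sd_total * (p_pass * q_fail / ordinate)


r_bis = biserial_r(mean_pass=72.46, mean_fail=64.30, sd_total=24.32, p_pass=0.91)
print(round(r_bis, 2))  # 0.17
```

The two group means are also consistent with the overall mean: 0.91 × 72.46 + 0.09 × 64.30 is about 71.7.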

Evaluation.—The test's pilot validity is almost identical with
that predicted from its factor pattern (see table 28.18). It might be
revised so as to maximize its spatial-relations loading, in which case it
would be one of the best tests available for that factor. Other tests,
described in chapter 19, are more promising for this, however. Its
numerical variance is no aid to pilot prediction, and its loading in integration
III is probably a definite handicap. Its validity for navigator selection
promises to be greater than that for pilots, even if its integration III
loading is zero in the navigator criterion.

Parenthetically, it is interesting to point out that a test that was
developed to function in printed form as the Complex Coordination test does
in apparatus form, on the basis of one hypothetical trait (integration),
came out strongest in the most valid factor, which in the meantime
became recognized as something quite different in character (spatial
relations). The spatial characteristics of the task in planning a course had
been used as the medium through which integrative aspects of behavior
were to be measured. Had the medium been changed, the communality
with complex coordination would probably have been lost.

Variations  of  the  test. — Two  forms  of  Planning  a  Course,  CI406AX1 
and  CI406AX2,  preceded  the  final  form.  The  changes  introduced  in 
CI406AX3  were  designed  to  shorten  and  clarify  the  directions,  although 
the  essential  characteristics  of  the  test  remained  unchanged.  Greater 
simplicity  and  a  more  nearly  optimum  difficulty  level  were  achieved  in 
the  final  form. 

Flight  Formations,  CI654AX5  4 

This  is  the  last  experimental  form  of  another  test  in  the  integration 
group. 

Description. — The  examinee  must  determine  the  formation  of  a  group 
of  planes  after  certain  moves  have  been  described. 

(1)  Internal  characteristics. — The  test  consists  of  1  unrecorded  sam¬ 
ple  problem,  2  recorded  but  unscored  practice  problems,  and  40  scored 
items— 20  in  part  I  and  20  in  part  II. 

(2) Administration.—Twelve minutes are allowed for part I and 10
minutes for part II. Following are the first two pages of directions, and
sample problems are given in figures 10.2 and 10.3.

This  is  a  test  of  your  ability  to  plan  ahead  of  the  plane  in  flight  formations.  The 
formation  of  each  flight  consists  of  three  planes  appearing  in  different  relative 
positions.  Your  task  will  be  to  determine  the  new  formations  of  these  planes  after 
they  have  completed  certain  moves. 
FIGURE 10.2

SAMPLE ITEM USED TO EXPLAIN FLIGHT FORMATIONS, CI654AX5

•  Dm!'M4  ii  Fijrtk«l«(Kil  Reiearch  Uuil  K».  J.  Chief  emtriWlin;  U.  WillUa  IL  Wh«sJ*r* 
U.3  n.  Wrifhi. 
You will be shown the relative order and altitude of the three planes. In working
the problems you must first combine this order and altitude into a flight formation.
Order is the relative positions of the planes from left to right. Altitude is the
relative positions of the planes from top to bottom.

Look at figure I below. (fig. 10.2a) The small squares represent the three
different planes. The squares after the word order show the order of the original
formation. Here it is striped plane at the left, dark plane in the center, and white
plane at the right. The altitude of the original formation, which is shown by the
small squares under ALT, is: dark plane at the top, white plane in the middle, and
striped plane at the bottom. When you combine this order and altitude you should
imagine a flight formation with the striped plane at the left and at the bottom, the
dark plane at the center and at the top, and the white plane at the right and in the
middle. In each problem you must always imagine the original formation by
combining the order and altitude before you make any moves. After you have
determined the original formation, you must carry out the moves that are described. In
figure I, the first move is: Striped plane moves to the same place in order as the
white plane. The second move is: White plane moves to the same altitude as the
black plane. These moves and the final formation are shown in figure II. (fig. 10.2b)


SAMPLE  PROBLEM 
The dotted lines indicate the original formation positions of the striped and white
planes. You must remember that the correct final formation is not determined by
the moves alone. The original formation must be visualized before the moves are
made.

Now examine the sample problems * * * and imagine what the original
formation of the planes should be. (See fig. 10.3a.)

If  you  have  interpreted  the  order  and  altitude  correctly,  you  should  have 
imagined  a  formation  like  the  one  in  figure  IV.  (fig.  10.3b.) 

Keeping the original formation in mind, make the moves called for by the
problem. Here you must imagine striped plane moves to the right of white plane, and
striped plane then moves to the same altitude as black plane. Select the correct final
formation from the five answers that are given below the problem.

0  is  the  correct  answer.  Figures  IV  and  V  show  how  the  moves  should  have 
been  made  to  give  you  the  correct  final  formation.  (See  figs.  10.3b  and  10.3c) 
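The two-step task in these directions (combine order and altitude, then apply the moves) can be sketched in code using the figure I example. This is an illustration only. It assumes that "moves to the same place in order" means taking the other plane's left-to-right position while keeping altitude, and likewise for altitude moves. The function names and the (column, row) encoding are not from the test.

```python
def build_formation(order, altitude):
    """Combine left-to-right order and top-to-bottom altitude.

    Returns {plane: (column, row)} with column 0 = left and row 0 = top.
    """
    return {plane: (order.index(plane), altitude.index(plane)) for plane in order}


def move_to_same_order(formation, mover, target):
    """Mover takes the target's left-to-right place and keeps its altitude."""
    new = dict(formation)
    new[mover] = (formation[target][0], formation[mover][1])
    return new


def move_to_same_altitude(formation, mover, target):
    """Mover takes the target's altitude and keeps its place in order."""
    new = dict(formation)
    new[mover] = (formation[mover][0], formation[target][1])
    return new


# Figure I: order is striped, dark, white; altitude is dark, white, striped.
start = build_formation(order=["striped", "dark", "white"],
                        altitude=["dark", "white", "striped"])
step1 = move_to_same_order(start, "striped", "white")
final = move_to_same_altitude(step1, "white", "dark")
print(final)  # {'striped': (2, 2), 'dark': (1, 0), 'white': (2, 0)}
```

Applying the moves to the order or altitude strips alone would not give this result, which is the point the directions make about visualizing the original formation first.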


(3) Scoring.—The scoring formula is R − W/4.

Statistical results.—Data for this test are moderately complete,
including validations for pilot training. Unclassified aviation students were
tested at Psychological Research Unit No. 3 in September 1943, and
classified pilots in class 44E were tested at that unit in October 1943.

(1)  Distribution  statistics. — The  distribution  of  scores  in  this  test  is 
indicated  by  a  mean  score  of  12.0  and  a  standard  deviation  of  8.3,  based 
upon  a  sample  of  284  unclassified  aviation  students. 

(2) Internal consistency.—The internal consistency of items is
indicated by a mean phi of 0.56 for part I and 0.55 for part II, with a
range from 0.05 to 0.95 for the total test and a standard deviation of
0.19 for part I and 0.14 for part II, based upon the highest 27 percent
and the lowest 27 percent of 800 unclassified aviation students.

(3) Reliability coefficient.—A reliability coefficient of 0.84, corrected,
was obtained by the alternate-forms method (Part I v. Part II) on a
sample of 1,553 classified pilots.
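The report does not name the correction applied. For a part I versus part II correlation the usual step-up is the Spearman-Brown formula for a test of double length, and the sketch below assumes that this is what "corrected" means here. Inverting the formula gives the part-versus-part correlation that a corrected value of 0.84 would imply. That uncorrected value is not reported in the text.

```python
def spearman_brown(r_parts: float, factor: float = 2.0) -> float:
    """Reliability of a test `factor` times as long as the correlated parts."""
    return factor * r_parts / (1.0 + (factor - 1.0) * r_parts)


def implied_part_correlation(r_full: float, factor: float = 2.0) -> float:
    """Invert Spearman-Brown: part correlation implied by a full-length value."""
    return r_full / (factor - (factor - 1.0) * r_full)


r_parts = implied_part_correlation(0.84)
print(round(r_parts, 2))                  # 0.72
print(round(spearman_brown(r_parts), 2))  # 0.84
```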

(4) Difficulty.—The difficulty level of items in this test is indicated
by a mean proportion of correct responses of 0.62, corrected for chance
success, a range from 0.41 to 0.99, and a standard deviation of 0.15.
These results are based upon the responses of the above-mentioned
sample of 800 unclassified aviation students.

(5) Factorial composition.—The chief factors are: integration I
(0.46), general reasoning (0.22), spatial relations (0.22), and
integration III (0.21). The communality is only 0.45, which is to be
compared with a reliability of 0.84.


Table 10.1.—Validity data for Flight Formations, CI654AX5, with the graduation-

(6) Test validity.—Data based on a large sample in elementary and
basic training are given in table 10.1.


Evaluation.—Flight Formations, CI654AX5, has a moderately high
loading in the new factor called integration I. Since its other loadings
fail to account for its validity of 0.23, the validity of this factor for pilot
selection should be considerable. Assuming that this factor validity is
0.25, the test validity is almost fully accounted for. Because of its unique
contribution, this test should be purified. Its reliability is satisfactory. It
would have a validity of at least 0.20 for navigators without including
any integration variance, whose navigator validity is unknown.

Flight Formations might well be expected to be a visualization test. In
a factor analysis of the integration battery, however, it revealed a
loading of only 0.04 in the visualization factor. No result could be more
decisive on this point. The conclusion should be that if this test is one of
visualizing, it is a different type than that common to known tests.

Signal Interpretation, CI656A *

This  is  another  test  in  the  integration  area. 

Description.—In each problem there is a diagrammatic representation
of 10 airplane carriers in a row. The examinee must determine from
which of these carriers planes will take off. This can be ascertained by
interpreting certain signals such as the number of flags on the mast of
each ship, the direction the ship is heading, and whether there are more
or fewer flags than on the previous take-off ship.

FIGURE 10.4

SAMPLE ITEMS OF SIGNAL INTERPRETATION, CI656A

The  examinee  must  compare  quickly  the  number  of  flags  on  each  ship 
with  the  number  on  the  previous  take-off  ship.  He  must  decide  from  the 
relationship  between  these  two  the  direction  of  the  next  take-off  ship. 

(1) Internal characteristics.—As has been stated, each problem
consists of a series of 10 ships. The test consists of two unscored sample
problems and 15 scored problems, yielding 150 scored items.

(2)  Administration. — Answers  are  marked  on  an  expendable  test 
booklet  by  circling  the  appropriate  arrow  under  each  item.  The  examinee 
also  receives  a  single  directions  sheet.  When  the  test  is  completed,  he  is 
instructed  to  transcribe  his  answers  to  a  regular  IBM  answer  sheet. 
Testing  time  is  limited  to  7  minutes.  Approximately  9  minutes  are  re¬ 
quired  for  administration  and  13  minutes  for  the  transcription  of  an¬ 
swers. 

In  figure  10.4  is  shown  the  first  series  of  10  sample  items.  The  follow¬ 
ing  are  parts  of  the  directions : 

Each problem is made up of a row of 10 ships. Your task is to determine which
ships in each row carry planes. You will locate these carriers by following certain
signals.

Look at sample problem A on your work booklet. (See fig. 10.4.) Note that some
of the ships fly no flags, some fly one flag, some two, and some three. Note also
the arrows under the ships. The arrows after the letter A all point to the left;
those after B all point to the right. The flags on the ships and the arrows under
the ships are the signals which you must interpret in order to locate the aircraft
carriers.

Here  is  the  way  you  must  interpret  these  signals.  Follow  these  directions  closely: 

1. The first ship in each row is always a carrier. So in the example, ship No. 1
is a carrier.

2. The number of flags on the mast of each carrier shows how many ships lie
between it and the next carrier. In the example, the first carrier, ship No. 1, flies
two flags. Thus, there are two ships between ship No. 1 and the next carrier. Ship
No. 4 is the next carrier.

3. Compare the number of flags on ship No. 4 with the number of flags on the
carrier you just left, ship No. 1. You must compare the number of flags in order
to determine the direction in which you must go to locate the next carrier ship.

a. If the present carrier flies more flags than the one you just left, you will
continue in the direction you are going.

b. If there are the same number or fewer flags on the present carrier than
on the one you just left, reverse your direction before looking for the next carrier.

c. You must show your direction by circling the arrow under the present
carrier which points in the direction you will go.

As ship No. 4 has more flags than ship No. 1, the previous carrier, circle the
arrow under ship No. 4 that points to the right and continue in the same direction
in finding the next carrier.

Always  be  sure  to  circle  one  arrow,  and  only  one,  under  each  carrier  you  locate. 

Now  complete  the  first  sample  problem. 

As ship No. 4 flies three flags, and you have circled the arrow under it pointing
to the right, you must skip three ships to the right in order to locate your next
carrier. This will make ship No. 8 your next carrier. Ship No. 8 flies the same
number of flags as the previous carrier, ship No. 4, so you must reverse your direction
before locating your next carrier. As your direction is changed, you must circle the
arrow under ship No. 8 which points to the left, your new direction.

Ship No. 8 flies three flags. You must now skip three ships to the left to find
the next carrier. This is ship No. 4, which had previously been found to be a
carrier. When the signals direct you to a ship which you have already marked as a
carrier, the problem is completed. Do not circle more than one arrow under any
one ship.
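The rules in these directions amount to a short procedure, sketched below for one row of ships. Several details are assumptions rather than statements from the directions: the initial direction is taken to be rightward, the arrow recorded for ship No. 1 is taken to be "right", and a signal that would run past the end of the row is treated as ending the problem. In the sample row only ships 1, 4, and 8 carry the flag counts given in the text. The zeros are placeholders.

```python
def locate_carriers(flags, start_direction=+1):
    """Follow the flag signals along one row of ships.

    flags[i] is the number of flags on ship i + 1. Returns a list of
    (ship_number, circled_arrow) pairs, one per carrier found.
    """
    position, direction = 0, start_direction  # rule 1: ship No. 1 is a carrier
    found = {0: direction}
    while True:
        # Rule 2: the flag count is the number of ships lying in between.
        nxt = position + direction * (flags[position] + 1)
        if not 0 <= nxt < len(flags):
            break  # assumption: running off the row ends the problem
        if nxt in found:
            break  # directed back to a marked carrier: problem completed
        if flags[nxt] <= flags[position]:
            direction = -direction  # rule 3b: same or fewer flags, reverse
        found[nxt] = direction      # rule 3c: circle the arrow for the new heading
        position = nxt
    names = {+1: "right", -1: "left"}
    return [(ship + 1, names[d]) for ship, d in sorted(found.items())]


# Ships 1, 4, and 8 follow the sample problem; the zeros are placeholders.
sample_row = [2, 0, 0, 3, 0, 0, 0, 3, 0, 0]
print(locate_carriers(sample_row))  # [(1, 'right'), (4, 'right'), (8, 'left')]
```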

(3)  Scoring. — The  last  problem  is  omitted  from  scoring  because  of 
faulty reproduction. The scoring formula is R − W/2. A right response
consists  in  correctly  circling  or  not  circling  each  pair  of  arrows;  the 
maximum  score,  therefore,  is  126.' 

Statistical  results. — Data  on  distributions,  reliability,  and  validity  are 
available. 

(1)  Distribution  statistics. — Distribution  constants  for  this  test  are 
shown  in  table  10.2. 

Table 10.2.—Data on distribution of scores for Signal Interpretation, CI656AX2 ¹
and CI656A

Unclassified aviation students ² .... 285

Classified pilots ³ .... 1,181

84.1

I0J.S

¹ A description of this form is given on page —.

² Tested at Psychological Research Unit No. 3 in October 1943.

³ In class 44G. Tested at Psychological Research Unit No. 3.

(2) Reliability coefficient.—An odd-even reliability coefficient of 0.77,
corrected, was obtained for form AX2 on a sample of 285 unclassified
aviation students tested in October 1943 at Psychological Research Unit
No. 3.


(3)  Difficulty. — The  difficulty  level  for  form  A  is  indicated  by  a  mean 
proportion  of  correct  responses  of  0.73,  corrected,  with  a  standard  devi¬ 
ation  of  0.20  and  a  range  from  0.16  to  1.00.  This  result  is  based  upon 
727  classified  pilots  in  class  44G  who  were  tested  in  December  1943  and 
January  1944  at  Psychological  Research  Unit  No.  3. 

(4) Factorial composition.—The chief factor loadings found in the
AX2 form are in the integration I (0.59), general-reasoning (0.41),
and integration III (0.30) factors. The communality (0.69) almost
reaches the reliability (0.77) but possibly does not account for all of the
nonchance variance.

(5) Test validity.—The pilot validity of form A of the test is
indicated by a corrected biserial correlation of 0.21 with the primary-training
graduation-elimination criterion. This statistic is based on 2,112 pilots
in classes 44G and 44H who had been tested at Psychological Research
Unit No. 3. The proportion of graduates was 0.89, the mean score of the
graduates 97.9, the mean score of the eliminees 85.5, and the standard
deviation of all was 35.7.


*  Wkil*  thtre  ere  MO  torml  rttywin,  wk  »l  lk«  M  ecoeed  iMw 
■Mur  lift*. 
(6) Item validity.—The validity of items is indicated by a mean phi
of 0.07, a standard deviation of 0.06, and a range from −0.11 to +0.21.
This is based on 727 pilots in primary training (class 44G; tested at
Psychological Research Unit No. 3), of whom 127 were eliminees.

Evaluation.—Signal Interpretation, CI656AX2, defines, to a greater
extent than any other test, a new factor identified as integration I,
whose pilot validity has not yet been established but which appears to be
near 0.25. The test has a loading of 0.59 in this new factor. Because the
test does not have high loadings on known valid factors for pilots, it is
reasonable to suppose that it derives much of its validity for pilots (0.21)
from the integration I factor. Further experimentation on this factor,
therefore, is warranted. The loadings on integration III and general
reasoning should by all means be reduced, leaving the test practically
pure for integration I.

Variations  of  the  test. — Signal  Interpretation,  CI656A,  was  preceded 
by  CI656AX2.  This  earlier  form  contained  all  the  essential  elements  of 
the  A  form.  Its  directions,  however,  were  considerably  longer  and  more 
complicated,  and  it  was  divided  into  two  parts,  the  task  of  part  II  being 
complicated  by  additional  signals.  Each  part  includes  9  practice  items 
and  90  test  items. 

This test exemplifies the difficulty encountered in writing directions
for all the tests in the integration area. The rationale for the area lists
the considerations that were observed in constructing tests to measure
integration. From these points it is clear that the tests necessarily had
to be complex. Consequently, there was an inherent problem of writing
effective directions to describe a complex task.

A revision of the A form, CI656B, was begun but never completed. It
represented an attempt to simplify the directions further. A form,
CI656C, was later prepared to administer for intercorrelation studies.

Forced Landings, CI652AX4, CI652A *

These are the final two forms of another test in the integration group.
It was believed that this test would require decisions such as the pilot
must make in complex situations involved in forced landings.

Description. — Planes  of  varying  size  (single-engine,  twin-engine,  four- 
engine)  are  presented  on  two-dimensional  grids  representing  (1)  alti¬ 
tude  and  (2)  distances  to  various  landing  fields  which  differ  in  desir¬ 
ability.  Wind  arrows  indicate  updraft  and  downdraft,  which  add  to  or 
subtract  from  the  gliding  range.  The  examinee  is  required  to  select  for 
each  plane,  in  order,  the  best  landing  field  within  range.  Thus,  he  must, 
as  quickly  as  possible,  integrate  the  facts  concerning  the  type  of  plane, 
its  location  in  relation  to  a  landing  field,  effect  of  the  wind  on  its  gliding 
range,  and  the  desirability  of  the  landing  fields  in  determining  the  best 
field  upon  which  to  land  each  plane. 

•  PmtwH  it  PmWntiol  KtinrHl  Unit  It*.  J.  CVtt  cMttiWhri:  U  U«n»  G.  Cm rptmttr, 
Jr,  S/S«t.  J.  Ctdtm  Em*.  He.  Ctrl  Jw. 
(1) Internal characteristics.—Forced Landings, CI652AX4, consists
of 3 parts of 30 scored items each. It contains five recorded but
unscored sample items. The parts are progressively more difficult.
Single-engine airplanes appear in part I. Twin-engine, as well as single-engine
airplanes, appear in part II. Twin-engine planes can glide twice as far,
winds affect them twice as much, etc. Four-engine, as well as
single-engine and twin-engine planes, appear in part III. Four-engine planes
can glide three times as far as single-engine planes, winds affect them
three times as much, etc. Form CI652A consists of 2 parts of 30 scored
items each, and 5 recorded but unscored sample items. Part I contains
single-engine planes. Part II contains single- and twin-engine planes.

(2)  Administration. — The  CI652AX4  form  was  administered  as  part 
of  the  experimental  integration  battery  in  September  1943.  Then,  direc¬ 
tions  were  simplified  and  the  30  items  in  part  III  were  dropped  in  mak¬ 
ing  the  CI652A  form.  It  was  decided  to  delete  part  III,  because  the 
test  was  lengthy,  and  the  interpart  correlations  were  high. 

Eight minutes are allowed for part I, 7 minutes for part II, and (in
the CI652AX4 form) 7 minutes for part III. A single page of directions
is provided. Following are the directions for part I of CI652A;
the five sample problems are reproduced in figure 10.5. Comments made
by the administrator, which are not on the directions sheet, are italicized.

This is a test of your ability to make decisions quickly in problems similar to
forced landings.


The planes are numbered consecutively, while the fields are lettered A, B, C, D,
or E. Each plane is directly above the field on the same line. The diagrams show
only two dimensions, altitude and horizontal distance:

a. Altitude, in miles, is shown by the numbers at the left and right of the
diagram.

b. The distance from one vertical line to the next is one mile. The distance
a plane needs to glide to get to a field is measured sideways only. Count the spaces
between the plane and the field to determine the number of miles between them.

c. Each arrow at the left indicates the direction of winds affecting any planes
at the same altitude.

FIGURE 10.5

SAMPLE PROBLEMS OF FORCED LANDINGS, CI652A

Now look at the front of the work booklet. Hold it up in front of you.

The planes are in the air directly above the field on the same line. For example,
plane 61 is directly above the field C. The higher the planes are on the chart, the
more altitude they have. For example, plane 61 is 2 miles high as indicated by the
number at the left. How high is plane 62? (Pause.) Three miles is right.

How far a plane needs to glide to get to any field is measured by ground
distance only. For example, plane 61 is 1 mile from field B. How far will plane 61
glide to get to field E? (Pause.) Six miles is correct, because we ignore altitude
in computing distance. How far is plane 62 from field A? (Pause.) Five miles is
correct.

Follow  along  on  your  instruction  sheet  again.  Glance  at  your  diagram  from 
time  to  time. 

Gliding  range: 

A plane can glide in either direction as many miles as it has miles of altitude.
The arrows at the same altitude from which a plane starts indicate the winds which
will affect the gliding range of the plane. Their effect is as follows:

A. Gliding with the wind adds 1 mile to your gliding range.

B. Gliding against the wind subtracts 1 mile from your gliding range.

C. An updraft (↑) adds 1 mile to your gliding range.

D. A downdraft (↓) subtracts 1 mile from your gliding range.

A plane is not affected by any winds or drafts other than those at the altitude
from which it starts.

Landing  rules: 

A. Do not land a plane on the field over which it starts.

B. Land the planes in order (first No. 1, then No. 2, then No. 3, etc.).

C. Two planes may not land in succession on the same field. For example, planes
1 and 2 may not land at the same field, but 1 and 3 may.

D. Land at the best available field. The fields are graded A, B, C, D, and E.
Field A is best. Field E is worst.

E. Land at the nearer of two equally good fields. For example, if a plane has
a choice of two grade C fields, choose the nearer one.

F. Measure the distance between a plane and the landing field by counting the
spaces only, not the altitude.

Caution.— Once  you  have  chosen  the  best  field  for  a  plane,  do  not  go  back  and 
change  it  to  get  a  better  grade  field  for  the  next  plane. 

Now  look  at  the  front  of  your  work  booklet,  below  the  diagram. 

Here we have the steps to follow in landing plane No. 61. Notice on the diagram
the altitude as indicated by the numbers, and the wind and updraft as indicated by
the arrows, as we figure the gliding range in each direction for plane No. 61.

Step One.—Gliding range to your left is 2 miles (altitude gives 2 miles, updraft
adds 1 mile, and going against the wind subtracts a mile). Gliding range to your
right is 4 miles (altitude gives 2 miles, updraft adds 1 mile, and going with the
wind adds a mile).

Step Two.—Note the fields within gliding range. Fields B, D, and C are within
4 miles on your right; none is to the left. The other C field cannot be used because
plane No. 61 is directly above it.


Step  Three.— Select  the  best  field.  Field  B  is  best,  because  B  is  a  better  grade 
field  than  C  or  D. 

On  your  answer  sheet,  black  in  the  space  under  letter  B  after  item  61. 

Notice  that  plane  61  went  to  a  field  only  !  mile  away,  even  tliough  it  could  have 
glided  4  miles.  In  other  words,  you  can  land  a  plane  at  fields  anywhere  within  the 
gliding  range. 
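The three steps worked for plane No. 61 can be sketched in code. This is only an illustrative sketch, not part of the test materials: the `Field` record, the function names, the field identifiers, and the distances assigned to fields D and C are assumptions made for the example, as is the reading that a field directly below the plane (distance 0) cannot be used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str      # hypothetical identifier for one field on the diagram
    grade: str     # "A" (best) through "E" (worst), rule D
    side: str      # "left" or "right" of the plane
    distance: int  # spaces between plane and field, rule F


def gliding_range(altitude_miles, updraft_miles, wind_miles, with_wind):
    """Step One: altitude plus updraft, plus or minus the wind."""
    wind = wind_miles if with_wind else -wind_miles
    return altitude_miles + updraft_miles + wind


def choose_field(fields, range_left, range_right, previous=None):
    """Steps Two and Three: best grade in range, nearer field on ties."""
    limit = {"left": range_left, "right": range_right}
    candidates = [
        f for f in fields
        # Assumed reading: a field directly below the plane is excluded.
        if 1 <= f.distance <= limit[f.side]
        # Rule C: not the field used by the preceding plane.
        and f.name != previous
    ]
    if not candidates:
        return None
    # Grade letters sort from best to worst; distance breaks ties (rule E).
    return min(candidates, key=lambda f: (f.grade, f.distance))


left = gliding_range(2, 1, 1, with_wind=False)
right = gliding_range(2, 1, 1, with_wind=True)
fields = [
    Field("B", "B", "right", 1),
    Field("D", "D", "right", 2),   # distance assumed
    Field("C1", "C", "right", 3),  # distance assumed
    Field("C2", "C", "right", 0),  # the C field directly below the plane
]
best = choose_field(fields, left, right)
```

With these assumed positions the sketch reproduces the worked answer: ranges of 2 and 4 miles, and field B as the choice.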

(3)  Scoring. — The  scoring  formula  is  R—W/4. 
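The formula R—W/4 is rights minus one-fourth of wrongs. A minimal sketch follows; the function name and the example counts are hypothetical, and the remark that the divisor matches the usual guessing correction for five-choice items (number of choices minus one) is an interpretation, not a statement from the report.

```python
def formula_score(rights, wrongs, divisor=4):
    """Rights minus a fraction of wrongs; omitted items are not counted."""
    return rights - wrongs / divisor


# Hypothetical examinee: 40 items right, 12 wrong, the rest omitted.
example = formula_score(40, 12)
```

Under this formula an examinee who guesses blindly among five choices gains nothing on the average, since one right answer is offset by four wrong ones.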

Statistical  results. — The  data  below  are  for  classified  pilots  in  class 
44J  tested  at  Psychological  Research  Unit  No.  3  from  April  10  to  13, 
1944,  and  for  unclassified  aviation  students  tested  at  that  unit  on  March 
9  and  April  16, 1944,  and  on  August  27, 1943. 

(1)  Distribution  statistics. — Typical  distribution  constants  for  this 
test  are  shown  in  table  10.3. 


Table 10.3.— Distribution constants for Forced Landings, CI652A and CI652AX4

[Table body not legible in the source. Columns: Form, Group, N. Rows: classified pilots; unclassified aviation students.]

(2)  Internal  consistency. — The  internal  consistency  of  items  in  form 
CI652A  is  indicated  by  a  mean  phi  of  0.41,  with  a  range  from  0.10  to 
0.70  and  a  standard  deviation  of  0.15,  based  on  the  highest  27  percent 
and  the  lowest  27  percent  of  750  classified  pilots. 

(3) Reliability coefficient.— This has been estimated from two samples
by the alternate-forms method. The data are presented in table 10.4.


Table 10.4.— Reliability coefficients for Forced Landings, CI652A and CI652AX4

[Table body not legible in the source. Rows: Part I v. Part III; Part II v. Part III.]

(4) Difficulty.— The difficulty level of items in form CI652A is indicated
by a mean proportion of correct responses of 0.47, corrected for chance
success, with a standard deviation of 0.22 and a range from 0.18 to 0.79,
based on a sample of 750 classified pilots.

(5)  Factorial  composition. — The  chief  factors  of  form  CI652AX4  are 
general  reasoning  (0.53)  and  integration  II  (0.38).  No  other  loading 
exceeds  0.18  (verbal).  The  communality  is  0.53,  which  is  considerably 
short  of  the  reliability. 

(6) Test validity.— Validation has been determined for part scores
and total score, and also for right scores and wrong scores. The data are
presented in table 10.5.

Table 10.5.— Validity data for Forced Landings, CI652A, based on pilots in
primary training, with the graduation-elimination criterion

Evaluation.— Forced Landings, CI652AX4, is a fair test of general
reasoning. Its validity for pilots should be only about 0.12, allowing for
some validity for the integration factor. The navigator validity would
probably exceed 0.20 from the reasoning component alone. Because of
the intrinsic complexity of the task in forced landings, it is doubtful that
the reasoning loading could be decreased if that were desired. If it could
be rid of its loading in integration II, it would be the best general
reasoning test developed in the program. Possibly this test could be
developed as a navigator-selection instrument, but it lacks promise as a pilot
test.

Combat  Planes,  CI655AX5  * 

This  test  emphasizes  the  ability  to  carry  out  complicated  directions, 
keeping  in  mind  restricting  rules. 

Description.— In this test two squadrons of planes in mock combat
are represented. The planes vary in type (single-engine, twin-engine, and
four-engine). The task is to determine as quickly as possible, from mock
combat rules given in the directions, which opponents each plane can
attack. The examinee is required to indicate which opponents are attacked
and (in parts II and III) whether a plane stays on the offensive
or changes to the defensive. In making these decisions, he must take into
consideration the size of each plane, the proximity of the opponents, and
the identity of the squadron which starts the offensive.

(1) Internal characteristics.— Part I of Combat Planes, CI655AX5,
contains 1 sample column of 6 unrecorded and unscored items, 1 practice
column of 10 recorded but unscored sample items, and 90 scored
items. Part II contains a sample column of 6 unrecorded and unscored
items, 1 practice column of 10 recorded but unscored sample items, and
60 scored items. Part III contains 90 scored items.

(2) Administration.— Four and one-half minutes are allowed for each
part.

Following are the directions and sample items for part I. The words
in italics are part of the administrative directions and do not appear in
the test booklet.

* Developed at Psychological Research Unit No. 1. Chief contributor: T/Sgt. SuM J.

In this test you will be asked to apply rules to a simplified aerial combat situation.
You  will  have  to  determine: 

1.  If  a  plane  can  attack,  and 

2.  Whom  it  can  attack. 

In  order  for  a  plane  to  attack,  it  must  meet  three  conditions: 

Condition 1. It must be on the offensive.

Condition 2. It must be immediately next to one or two opponents.

Condition 3. It must be the same size or smaller than its opponents.

Look at the diagram below. (See fig. 10.6.)


FIGURE 10.6

EXPLANATORY DIAGRAMS FOR COMBAT PLANES, CI655AX5

Observe that fighters (single-engine) can attack any type of plane: other fighters,
medium bombers, or heavy bombers. Medium bombers (twin-engine) can attack
medium bombers and heavy bombers. Heavy bombers (four-engine) can attack only
other heavy bombers.

At  the  top  of  each  column  you  will  be  told  whether  White  or  Black  Squadron 
is  on  the  offensive.  Sometimes  it  will  be  one,  sometimes  the  other.  Be  sure  to  check 
this  for  each  column. 

In the sample, Black Squadron is on the offensive. (See fig. 10.7.)

Plane S1 is on the offensive; it is next to an opponent, and it is smaller than its
opponent. Therefore, it meets the three conditions and can attack plane S2 below it.
Plane S2 is on the defensive. It does not meet condition 1, and therefore, cannot
attack. Plane S3 is on the offensive and is next to two opponents. Is it the same
size or smaller than its opponents? [Pause.] It is larger than plane S2 above;
therefore, it cannot attack this plane. It is the same size as plane S4 below, and
therefore, can attack S4. S4 is on the defensive and cannot attack. S5 is on the offensive,
is next to an opponent, and is smaller than its opponent. It can attack plane S4
above. Plane S6 is on the offensive but is not immediately next to one or two
opponents. Therefore, it cannot attack.

RULES FOR MARKING ANSWERS

Mark A on your answer sheet opposite the number of the problem, if the plane
above is attacked. A stands for above.

Mark B on your answer sheet opposite the number of the problem, if the plane
below is attacked. B stands for below.

FIGURE 10.7

SAMPLE PROBLEMS FOR COMBAT PLANES, CI655AX5

Mark C on your answer sheet opposite the number of the problem, if both
planes (above and below) are attacked.

Mark D on your answer sheet opposite the number of the first plane that is on
the defensive in each column. D indicates that this plane and members of the
same squadron below it are on the defensive. Make no marks for any other defensive
planes.

Mark E on your answer sheet opposite the number of the problem if the plane
is on the offensive, but cannot attack.

In parts II and III, a complication is introduced. The new rule is that
if a plane has two opponents and cannot attack either of them, the offensive
changes. In addition to applying this new rule, the examinee must
now not only answer D for the first defensive plane in each column, but
also for every plane with which the offensive changes.

Each  part  of  the  test  is  separately  timed.  The  test  is  highly  speeded. 

(3) Scoring.— The scoring formula is R—W/S.

Statistical results.— Data are quite limited but are sufficient to support
a tentative evaluation. The results are for examinees tested in October
1943 at Psychological Research Unit No. 3.

(1)  Distribution  statistics.— For  the  three  parts  separately,  data  are 
given  in  table  10.6. 


Table 10.6.— Distribution constants for the three parts of Combat Planes,
CI655AX5, based on 273 unclassified aviation students

[Table body garbled in the source. Columns: Part, M, SD. Legible entries: 20.6, 15.5.]

(2)  Reliability  coefficients. — Some  indication  of  reliability  of  the 
parts  is  given  in  table  10.7. 


Table 10.7.— Alternate-forms reliability coefficients for Combat Planes, CI655AX5,
based on a sample of 273 unclassified aviation students

Parts 1                        r

Part I versus Part II         0.66
Part I versus Part III         .65
Part II versus Part III        .01

1 Owing to disparities in part dispersions, the Spearman-Brown correction formula was not
applied in all three combinations.


(3) Factorial composition.— The factorial picture of this test shows
noteworthy loadings in the integration I (0.57), general-reasoning
(0.33), verbal (0.31), and integration III (0.28) factors.

Evaluation. — Combat  Planes,  CI655AX5,  helps  to  define  a  new  fac¬ 
tor,  identified  as  integration  I  which  accounts  for  32  percent  of  the 
total  variance  of  the  test.  Since  the  test  was  not  administered  for  valida¬ 
tion,  nothing  positive  can  be  said  concerning  its  validity  for  air-crew 
success.  Based  upon  the  factor  validities  given  in  table  28.17,  however, 
the  predicted  pilot  validity  is  0.18.  This  is  equal  to  the  expected  validity 
for  Signal  Interpretation,  which  it  resembles  closely  factorially.  Conclu¬ 
sions  regarding  that  test  also  apply  here.  The  chief  difference  is  that 
Signal  Interpretation  has  a  larger  reasoning  variance  and  this  test  a 
larger  verbal  component.  Both  need  purifying. 
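The statement that integration I accounts for 32 percent of the total variance follows from squaring the loading of 0.57. The short sketch below applies the same arithmetic to the four loadings reported for this test; it assumes uncorrelated (orthogonal) factors, and the variable names are illustrative.

```python
# Loadings reported for Combat Planes, CI655AX5.
loadings = {
    "integration I": 0.57,
    "general reasoning": 0.33,
    "verbal": 0.31,
    "integration III": 0.28,
}

# With uncorrelated factors, the squared loading is the proportion of
# total test variance attributable to that factor.
variance_share = {name: round(a * a, 2) for name, a in loadings.items()}
```

The share for integration I comes to 0.32, matching the figure given in the evaluation.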

Variations of the test.— Four forms preceded Combat Planes,
CI655AX5.  Statistical  analysis  was  not  responsible  for  the  development 
of  these  forms.  Indeed,  no  data  are  available  for  the  early  revisions.  The 
changes  were  prompted  by  the  necessity  for  making  the  directions  more 
readily  understandable. 

Forms X2, X3, and X4 were known by the title of Attacking Planes.
X2 involved a concept of support. Even if a plane could meet the three
prescribed conditions, and it faced two opponents, it could not attack
unless given support from the nearest plane or planes heading in the
same direction. An adjacent plane could give support only if both the
opponents of the attacking plane were of the type the supporting plane
could attack. This concept proved too difficult to explain adequately in
a directions period of limited duration. The task in the X3 revision was
simplified by omission of this support concept until part II. This gave
the examinees an opportunity to become familiar with the basic problems
of the test in part I before encountering the complex matter of support.
This also proved unsatisfactory, however, and all references to support
were dropped in the X4 form. Instead, the idea of a change in offensive
was introduced in part II. If a plane had two opponents and could not
attack either of them, the offensive changed to the other squadron.
Another important change in form X4 was the placement of the items in
vertical columns. In forms X2 and X3 the offensive and defensive squadrons
were opposite each other in horizontal rows. The vertical presentation
of items facilitated the administration of the test. This feature was
incorporated in the final form of the test, Combat Planes, CI655AX5,
together with numerous improvements in the arrangement and writing
of the directions.

Complex  Concentration,  CI658AX1  * 

This is the only test, in the area of integration, in which motion
pictures were used. There were two factors that determined the presentation
of the test in motion-picture form. First, the test involved the use of
color, and it was extremely difficult to obtain color printing. Another
circumstance determining the use of motion pictures was the precise timing
required. Most of the color patterns were to be exhibited for only three
seconds. Uniformity in each administration would have been extremely
difficult to achieve if test booklets and ordinary testing procedures had
been used.

The material to be photographed, which consisted of varying numbers
of 2-inch colored squares, was mounted on a set of gray background
cards. Directions were printed and mounted on similar cards. These were
placed in proper order, then each card was photographed separately,
being exposed to the camera for a predetermined period. The result is
a relatively smooth sequence of contiguous, immobile cards.

Description.— For each problem, three groups of differently colored
squares are presented on the screen in rapid succession. Each group
contains three colors and is visible for three seconds. At the conclusion of
each series of three groups, the examinee must record the total number
of times he believes each color appeared. Thus, as he sees each new
group within a set, he must add the colors to his previous totals in that
set without forgetting or confusing colors or frequencies. As the test
progresses, the number of colors included in each group increases from
three to four and then to five.

(1) Internal characteristics.— The test contains two unscored sample
series and is divided into two parts. Each part contains 16 series
and 61 scored items. Running time for the film is 27 minutes.
Transcription of answers from the work sheet to the regular 5-place IBM
answer sheet requires approximately 10 minutes.

* Developed at Psychological Research Unit No. 3. Chief contributor: Sgt. Hyman Heller.

(2)  Administration. — Each  examinee  receives  an  expendable  work 
sheet  on  which  answers  are  first  recorded.  Sufficient  lighting  is  provided 
in  the  test  room  to  allow  the  examinee  to  see  his  work  sheet,  without 
radically  decreasing  the  visibility  of  the  image  on  the  screen.  Each  ex¬ 
aminee  is  given  a  pencil  with  the  eraser  removed.  This  is  to  prevent  tally¬ 
ing  of  colors  each  time  they  appear,  on  the  theory  that  the  examinee 
will  not  do  so  if  he  cannot  erase  his  tally  marks. 

Although  the  examinees  are  not  led  to  expect  it  in  advance,  erasers 
are  distributed  at  the  beginning  of  the  transcription  period  so  that  errors 
in  transcribing  answers  to  the  IBM  answer  sheet  can  be  corrected. 

All  instructions,  except  those  for  transcribing  answers,  are  in  film 
subtitles.  There  is  no  sound  track. 

After  preliminary  instructions  are  given,  the  first  sample  series  ap¬ 
pears.  It  consists  of  1  blue  and  1  red  square,  shown  for  3  seconds.  Then 
a  gray  blank  background  appears  for  2  seconds,  followed  by  2  red 
squares.  Again  the  gray  background,  and  then  two  red  squares  and  one 
blue  appear.  Now  the  examinees  are  instructed  to  record  their  answers 
on  their  work  sheets.  While  answers  are  being  marked,  a  gray  back¬ 
ground  appears  on  the  screen  for  10  seconds.  As  the  items  become  more 
difficult,  this  answer  period  is  lengthened  to  15  and  then  to  20  seconds. 

The answers to the first sample series are graphically illustrated when
a hand appears on the screen and writes the answers in the proper places
on a work sheet. At the same time, the directions read, "For sample 1,
your answers should be 5 red and 2 blue." Another sample series is then
presented before part I begins.
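The required answer is a running tally of each color across the three groups of a series. The sketch below reproduces the totals for the first sample series; the function name and the list representation of the groups are illustrative assumptions.

```python
from collections import Counter


def series_totals(groups):
    """Total number of squares of each color across all groups in a series."""
    totals = Counter()
    for group in groups:
        totals.update(group)
    return dict(totals)


# First sample series: 1 blue and 1 red; then 2 red; then 2 red and 1 blue.
sample_1 = [
    ["blue", "red"],
    ["red", "red"],
    ["red", "red", "blue"],
]
answers = series_totals(sample_1)
```

The totals are 5 red and 2 blue, as stated in the film directions. The examinee must keep these totals mentally, since tallying on the work sheet is discouraged.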

(3)  Scoring. — The  scoring  formula  is  R— W/4. 

Statistical results.— The data for pilots given below are for examinees
in classes 44E and 44F tested at Psychological Research Unit No. 3 in
October and November 1943 respectively. The data for navigators are
for examinees tested at Selman Field from June 19 to June 22, 1944, and
at Psychological Research Unit No. 3 from May 4 through May 6, 1944.

(1) Distribution statistics.— The distribution of scores in this test is
described by a mean of 62.7 and a standard deviation of 15.9, for parts
I and II combined, based on a sample of 856 classified pilots in classes
44E and 44F.

(2)  Internal  consistency. — The  internal  consistency  of  items  is  indi¬ 
cated  by  a  mean  phi  of  0.29  with  a  range  from  0.06  to  0.51  and  a  stand¬ 
ard  deviation  of  0.08,  based  on  the  highest  27  percent  and  the  lowest  27 
percent  of  473  classified  pilots  in  class  44F. 

(3) Reliability coefficient.— A reliability coefficient of 0.77, corrected,
was obtained by the alternate-forms method (Part I v. Part II) on a
sample of 473 classified pilots in classes 44E and 44F, and of 0.84 based
on 668 navigation students.

(4) Difficulty.— The difficulty level of items in the test is indicated
by a mean proportion of correct responses of 0.50, with a standard
deviation of 0.18 and a range from 0.12 to 0.94, based on 800 classified
pilots in 44F.

(5) Test validity.— The test has been validated for both pilot and
navigation training, as shown in table 10.8.

Table 10.8.— Validity data for Complex Concentration, CI658AX1, with the
graduation-elimination criterion

(6) Right-wrong correlations.— For the navigator sample the correlations
of rights and wrongs scores within parts I and II were -0.92 and
-0.96; between parts the rights-wrongs correlations were -0.68 and
-0.69.

Evaluation.— Complex Concentration was not factor-analyzed with the
integration battery, because it had not been ready at the time the battery
was administered. Validity data for pilots are sufficient proof that the
test does not offer promise as a selection instrument for that specialty.
For navigators a satisfactory level of validity is shown, and because the
correlation with the navigator stanine at the time was sufficiently low
(approximately 0.30), the test offered some degree of uniqueness.

Code Analysis, CI653AX3 *

This is the final form of another test in the integration area. It was
designed as a measure of speed and facility in understanding and
analyzing interchangeable symbols and keys such as might be used in a
code.

Description.— In each item, a key number series composed of five
digits is presented. Below the key series appear five other series, each
composed of five digits. These series are the choices from which the
examinee is required to select the correct answers. The general task of
the examinee is to determine those alternative series that contain the
same digits as the key series. Some problems call for the determination
of alternatives which contain all the digits found in the key series;
others, for alternatives containing four and only four of the digits
found in the key series; or three and only three, etc. Which of these
determinations is required is coded within each key series itself, and
must be determined by the examinee. Later in the directions, alphabetical
substitutes for numbers (A=1, B=2, etc.) are introduced. Thus, the
examinee must interchange letters and numbers in solving the codes. He
must be aware of the digits (or letters and digits) of the original series,
the interchangeability of letters and digits, and the requirements of the
particular problem in selecting the proper series as the answers.

(1) Internal characteristics.— The test consists of 7 recorded but
unscored sample items, and 48 scored items, divided into 2 parts of 24
items each.

(2) Administration.— Thirteen minutes are allowed for part I and
11 minutes for part II. Examinees are paced by the administrator, who
informs them when there are but 6 minutes left to finish part I and also
when there are but 5 minutes to finish part II. Following are parts of
the directions and sample items. The words in italics are oral
administration directions and do not appear in the test booklet.

This is a test of your quickness at understanding and analyzing interchangeable
symbols and keys such as may be used in a code.

You will be shown a key number series composed of five digits. Below this will
be five other series. These will be the five choices from which you are to select the
correct answer series which contains the digits called for by that particular item.
Now, work sample 1.

Sample 1. In which of the following series are all five of the digits the same
as in the key series 9 5 8 7 5?

A. 9 5 6 8 7

B. 9 9 8 7 5

C. 5 7 8 5 4

D. 5 6 8 7 2

E. 5 5 9 8 7

(Administrator  reads  the  sample) 

E is the correct answer. All five of its digits are the same as the digits in the
key series. Notice that the digits in the answer series do not have to be in the same
order as in the key series. A is not correct, because it does not have two fives, B
does not have two fives, C has no nine, D has no nine, and has only one five. Notice
that in E the five appears twice, the same as it does in the key series.

Later  parts  of  the  instructions  read  as  follows: 

Up to this point, in each sample problem you have been told how many characters
to look for in the correct answer series. Actually, this information is contained in
the key series itself. The first five digits (1, 2, 3, 4, and 5) are code digits. 6, 7, 8,
and 9 are not code digits. The first one of the code digits that appears in the key
series indicates the number of corresponding digits in the correct answer.

Now look back at sample 1. The first code digit that appears in the key is digit
5. This indicates that all five of the digits have to be in the correct answer. Look
at sample 2. The first code digit that appears in the key is the digit 4. This indicates
that four and only four of the digits in the key series are found in the correct answer.
Look at sample 3. The first code digit is 3, indicating three of the digits are
found in the correct answer. Look at sample 4. The first code digit in the key
series is the digit 3. This indicates that three digits have to be found in the correct
answer. Obviously, 6, 7, 8, and 9 are not used to determine the number of
corresponding characters in the answer, since there cannot be more than 5 corresponding
characters in a series containing only 5 characters.

. . . Letters as well as digits will be included in the series. The following
letters are used: A, B, C, D, E, F, G, H, J. The first five of these letters, i.e., A
through E, are code letters. Each of the code letters can be substituted for its
corresponding code digit, 1 through 5 respectively. That is, A can be substituted for 1,
B can be substituted for 2, C for 3, D for 4, and E for 5. This substitution is
reversible; thus, 1 can be substituted for A, 2 for B, 3 for C, etc. The letters F, G,
H, and J are not interchangeable with the digits 6, 7, 8, 9 and no substitution can
take place.

In the following problems, the first code digit or its equivalent code letter found
in the key series indicates the number of corresponding characters found in the
correct answer.

If the key is "HJG 4 3 2" the first code digit is four. This indicates that four and
only four of the characters in the key appear in the right answer. If the key is
"6 C A D 2" the first code letter is C. Since C is equivalent to three, this
indicates that three and only three of the characters in the key appear in the right
answer. Now, work sample 5.

Sample 5. F E 5 3 A

A. H 3 3 3 A

B. 3 3 5 H 3

C. H C 1 E D

D. F 1 E E 3

E. 1 3 3 H D

D is the right answer. Notice that E is the first code letter found in the key, and
since E is interchangeable with five, this indicates that all five of the characters in
the key series are found in the correct answer. In choice D, F corresponds to F in
the key, 1 corresponds to A, E corresponds to E, and the second E corresponds to
five and 3 corresponds to three.

Remember  the  code  characters: 

A B C D E
1 2 3 4 5

F, G, H, J are not interchangeable with anything. Also 6, 7, 8, 9 are not
interchangeable with anything.
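The full rule set of the test can be sketched as follows: translate code letters to their code digits, read the required number of matches from the first code character in the key, and count matching characters with duplicates respected. This is a sketch under stated assumptions. The function names are hypothetical, and counting matches as the size of the multiset overlap is an interpretation based on the explanation of sample 1, where a choice with only one five is rejected.

```python
from collections import Counter

# Code letters A through E stand for code digits 1 through 5.
LETTER_TO_DIGIT = {"A": "1", "B": "2", "C": "3", "D": "4", "E": "5"}
CODE_DIGITS = {"1", "2", "3", "4", "5"}


def normalize(series):
    """Replace each code letter by its digit; leave other characters alone."""
    return [LETTER_TO_DIGIT.get(ch, ch) for ch in series]


def required_matches(key):
    """The first code digit (or code letter) in the key gives the count."""
    for ch in normalize(key):
        if ch in CODE_DIGITS:
            return int(ch)
    raise ValueError("key contains no code digit or code letter")


def matches(key, choice):
    """Number of characters shared by key and choice, counting duplicates."""
    overlap = Counter(normalize(key)) & Counter(normalize(choice))
    return sum(overlap.values())


def correct_choices(key, choices):
    need = required_matches(key)
    return [label for label, s in choices.items() if matches(key, s) == need]


sample_1 = {"A": "95687", "B": "99875", "C": "57854", "D": "56872", "E": "55987"}
sample_5 = {"A": "H333A", "B": "335H3", "C": "HC1ED", "D": "F1EE3", "E": "133HD"}
answer_1 = correct_choices("95875", sample_1)
answer_5 = correct_choices("FE53A", sample_5)
```

Applied to the two samples printed above, the sketch selects E for sample 1 and D for sample 5, in agreement with the directions.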

(3)  Scoring. — The  scoring  formula  R— W/4  is  used. 

Statistical results.— The data given are for examinees tested at
Psychological Research Unit No. 3 on March 9 and May 16, 1944, and on
August 27, 1943.

(1)  Distribution  statistics. — The  distribution  of  scores  in  this  test  is 
indicated  by  a  mean  score  of  14.9  and  a  standard  deviation  of  11.7, 
based  on  a  sample  of  285  unclassified  aviation  students. 

(2) Reliability coefficient.— A reliability coefficient of 0.89, corrected,
was obtained by the alternate-forms (part I-part II) method on a
sample of 285 unclassified aviation students.
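A "corrected" part-versus-part coefficient is presumably one stepped up to full test length by the Spearman-Brown formula, which the footnote to table 10.7 names. The sketch below shows that formula and its inverse; the function names are illustrative, and the part I-part II correlation it derives is only an implied value, since the uncorrected figure is not reported.

```python
def spearman_brown(r_parts, length_factor=2.0):
    """Reliability of a test lengthened by length_factor, given the
    correlation between comparable parts."""
    return length_factor * r_parts / (1.0 + (length_factor - 1.0) * r_parts)


def implied_part_correlation(r_full):
    """Inverse of the two-part case: the part-part correlation that
    would be stepped up to r_full."""
    return r_full / (2.0 - r_full)


# Corrected coefficient reported for Code Analysis.
implied_r = implied_part_correlation(0.89)
```

On this assumption, a corrected coefficient of 0.89 corresponds to a correlation of about 0.80 between the two parts.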

(3) Factorial composition.— The leading factors and their loadings
are: integration III (0.42), integration II (0.40), numerical (0.29),
verbal (0.23), and general reasoning (0.20). The communality (0.59)
falls far short of the reliability.

Evaluation.— Code Analysis has never been administered for validation.
Its factor loadings, however, indicate that it would not possess
much validity for pilot success, probably not over 0.07. It has no loading
greater than 0.16 in any factor of known pilot validity. If the two
integration factors are valid for navigator selection, however, the test
would have high navigator validity, for the combination of other factors
is very favorable.
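The remark that the communality falls far short of the reliability can be made concrete with the figures reported for Code Analysis. The sketch assumes uncorrelated factors, so that communality is the sum of squared loadings, and treats reliability minus communality as reliable variance not shared with the common factors; the variable names are illustrative.

```python
# Leading loadings reported for Code Analysis.
leading_loadings = [0.42, 0.40, 0.29, 0.23, 0.20]
communality = 0.59   # reported
reliability = 0.89   # reported, corrected

# Variance accounted for by the five leading factors alone.
leading_variance = round(sum(a * a for a in leading_loadings), 4)

# Common variance carried by the remaining, smaller loadings.
minor_variance = round(communality - leading_variance, 4)

# Reliable variance outside the common factors of this analysis.
specific_variance = round(reliability - communality, 2)
```

On these figures the five leading factors account for about 0.51 of the variance, and about 0.30 of the variance is reliable but not explained by any factor in the analysis.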

A  FACTOR  ANALYSIS  OF  INTEGRATION  TESTS  11 
In order to gain a better understanding of tests in this area, a special
factorial study was made of integration tests. Integration tests were
originally predicated on the hypothesis that the most valid aspect of the
Complex Coordination test was its measurement of the ability to observe
a complicated situation and to make a single integrated response to it.
Subsequent findings as related in this chapter have shown that while the
hypothesis was in error, it was none the less fruitful in directing
research into virgin areas. The newly discovered territory needs additional
illumination through factorial study.

The  Data 

In  addition  to  Complex  Coordination,  other  classification  tests  which 
had  been  recognized  as  good  factorial  reference  tests  were  included  in 
the  analysis.  These  tests  have  all  been  described  in  this  volume  except 
the Two-Hand Coordination test. This test uses the familiar lathe-type
machine  in  which  the  examinee  attempts  to  keep  a  contact  point  in  touch 
with  a  moving  button  which  follows  an  irregular  pathway  on  the  surface 
of  a  slowly  revolving  disc,  at  irregular  speeds.  The  right-and-left  and  to- 
and-fro  movements  of  the  contact  point  are  executed  independently  by 
turning  the  cranks  of  the  machine,  one  in  each  hand. 

In  addition  to  seven  experimental  integration  tests,  a  number  of  other 
experimental tests were also in the battery. The list with code numbers
may  be  seen  in  table  10.10.  It  includes  some  of  the  planning  tests  which 
were  of  particular  interest  at  the  time  and  some  new  reasoning  and 
spatial  tests.  Another  hypothesis  concerning  the  "intellectual  component 
of  the  Complex  Coordination  test"  was  that  it  is  a  space  factor  of  some 
kind,  hence  the  inclusion  of  spatial  tests. 

Two tests devised for this battery are of special interest: Log Book
Accuracy and Marking Accuracy (see ch. 16). It was thought that in
many of the integration tests, due to the great amount of rapid,
clerical-type work involved, part of which is in the use of the answer sheet,
much of the variance would be taken up with some kind of simple
psychomotor factor. The two tests were accordingly devised to isolate that
hypothetical factor and to determine its possible variance in the
integration tests. All the experimental tests included in the integration battery
have  been  described  in  detail  in  this  volume. 

Table 10.10. Centroid factor loadings for the Integration Battery

[The loadings, communalities, and test code numbers of this table are not recoverable from the source; only the list of tests survives.]

Test No.  Test name
 1        Speed of Identification
 2        Spatial Orientation I
 3        General Information (TVN)
 4        Reading Comprehension
 5        Mathematics B
 6        Numerical Operations, Front
 7        Numerical Operations, Back
 8        Mechanical Information
 9        Mechanical Principles
10        SAM Complex Coordination
11        Planning Air Maneuvers
12        Planning a Course
13        Instrument Comprehension I
14        Instrument Comprehension II
15        Figure Analogies
16        Spatial Visualization I
17        Map Distance
18        Hands V-1
19        Cubes V-1
20        Route Planning
21        Organizational Planning
22        Following Oral Directions
23        Following Directions
24        Code Analysis
25        Flight Formations
26        Forced Landings
27        Signal Interpretation
28        SAM Two-Hand Coordination
29        Combat Planes
30        Log Book Accuracy X1
31        Marking Accuracy X1

The correlation matrix is presented for the 32 variables in table 10.9,
the centroid loadings and communalities in table 10.10, and the rotated
loadings for the 13 factors in table 10.11. The sample was composed of
266 classified pilots. At the time these pilots were classified, no considerable
selection was made except on the basis of mechanical-experience,
visualization, spatial-relations, and psychomotor-coordination factors.
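
The centroid loadings and communalities referred to here can be illustrated with a minimal sketch. It assumes the usual textbook form of the centroid method, in which the first-factor loading of each test is its column sum in the correlation matrix divided by the square root of the grand total, and it takes a communality to be the sum of a test's squared loadings over orthogonal factors. The three-test matrix and the function names are hypothetical; they are not taken from tables 10.9 to 10.11.

```python
import math

def first_centroid_loadings(r):
    """Return first-centroid loadings for a square correlation matrix.

    The diagonal cells of r are assumed to hold communality estimates.
    """
    n = len(r)
    column_sums = [sum(r[i][j] for i in range(n)) for j in range(n)]
    grand_total = sum(column_sums)
    return [s / math.sqrt(grand_total) for s in column_sums]

def communality(loadings_for_one_test):
    """Sum of squared loadings of one test over orthogonal factors."""
    return sum(a * a for a in loadings_for_one_test)

# Hypothetical correlations among three tests (not data from this study).
r = [
    [0.60, 0.60, 0.40],
    [0.60, 0.60, 0.30],
    [0.40, 0.30, 0.40],
]
loadings = first_centroid_loadings(r)
print([round(a, 2) for a in loadings])
print(round(communality([0.50, 0.30]), 2))
```

In a full analysis the products of these loadings would be subtracted from the matrix and the procedure repeated on the residuals, after sign reflection where needed, to obtain further factors.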

The  Factors 

For  each  factor  the  list  of  tests  in  order  of  descending  loadings  is 
given.  Only  loadings  of  0.25  or  greater  are  included. 

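Rotated factor I is defined by the following data:

Test No.  Test name                    Loading
16        Spatial Visualization I      0.53
 9        Mechanical Principles         .47
15        Figure Analogies              .36
20        Route Planning                .32
17        Map Distance                  .31
 4        Reading Comprehension         .30
14        Instrument Comprehension II   .28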

This  is  the  well  established  visualization  factor,  which  is  conspicuous 
by  its  absence  from  integration  tests.  One  might  have  expected  it  to 
some  degree  in  all  of  them,  especially  in  the  Flight  Formations  test.  The 
latter  is  apparently  susceptible  of  successful  execution  without  the  aid 
of  visualization  of  the  type  conspicuous  in  Mechanical  Principles  and 
Spatial  Visualization  I  and  II. 

Rotated factor II is defined by the following data:

Test No.  Test name                Loading
 1        Speed of Identification  0.66
 2        Spatial Orientation I     .62
19        Cubes                     .45
31        Marking Accuracy          .35

This is clearly the perceptual-speed factor, which almost always has
loadings above 0.60 in the two leading tests. The only feature of interest
here is that the Marking Accuracy test is about 12 percent a matter of
perceptual speed, whereas from its appearance it would seem to be a
rather pure test of speed of simple motor movement. The perceptual
component must be attributed to the necessity of locating positions for
marking and to the visual control of accurate manipulations of the pencil.
No integration tests are significantly loaded with this factor, even
though they are usually speed tests and require attention to details.
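
The "12 percent" figure follows from the usual rule that, for uncorrelated factors and standardized scores, the share of a test's variance attributable to a factor is the square of its loading. A minimal sketch of that arithmetic is given below; the function names are illustrative only. The same rule underlies the 9 percent reported later for Combat Planes on the verbal factor, which corresponds to a loading of .30.

```python
import math

def variance_share(loading):
    """Proportion of a test's variance due to a factor (orthogonal factors)."""
    return loading ** 2

def loading_for_share(share):
    """Loading that corresponds to a given proportion of variance."""
    return math.sqrt(share)

# About 12 percent of variance implies a loading near .35.
print(round(loading_for_share(0.12), 2))
# A loading of .30 implies 9 percent of variance.
print(round(variance_share(0.30), 2))
```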

Rotated factor III is defined by the following data:

Test No.  Test name                     Loading
 7        Numerical Operations (Back)   0.75
 6        Numerical Operations (Front)   .60
21        Organizational Planning
 5        Mathematics B
30        Log Book Accuracy
12        Planning a Course
24        Code Analysis
23        Following Directions

[The loadings for the last six tests are not legible in the source.]

This,  the  numerical  factor,  has  somewhat  lower  factor  loadings  in  the 
classification  tests  than  usual.  This  may  indicate  that  other  tests  also 
have  reduced  loadings  in  it.  Since  pilots  were  not  selected  by  any  tests 
strongly  weighted  for  the  numerical  factor,  restriction  of  range  can 
hardly  be  blamed  for  this  state  of  affairs.  The  integration  and  planning 
tests  in  the  list  all  involve  the  use  of  numbers  in  an  elementary  fashion 
so  that  some  degree  of  saturation  with  this  factor  is  understandable.  It 
is  interesting  to  see  how  analysis  separates  sharply  between  Marking  Ac¬ 
curacy  and  Log  Book  Accuracy  with  respect  to  this  factor,  as  should  have 
been expected from the fact that in the latter, item numbers were involved.
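
The restriction-of-range argument can be made concrete with the standard correction for direct selection on one variable (often called Thorndike's Case 2). This is offered only as an illustration of the concept: the chapter does not state that any such correction was applied, the formula concerns correlations rather than factor loadings as such, and the numbers below are hypothetical.

```python
import math

def correct_for_range_restriction(r_restricted, sd_unrestricted, sd_restricted):
    """Estimate the correlation in the unselected group (direct selection)."""
    k = sd_unrestricted / sd_restricted
    r = r_restricted
    return (r * k) / math.sqrt(1.0 - r * r + r * r * k * k)

# Hypothetical case: r = .30 among selected pilots, whose standard
# deviation on the selection variable is two-thirds of the applicants'.
r_full = correct_for_range_restriction(0.30, 1.5, 1.0)
print(round(r_full, 2))
```

When the two standard deviations are equal the function returns the observed correlation unchanged, which matches the reasoning in the text: where no selection has narrowed the range, low loadings cannot be blamed on restriction.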


Rotated factor IV has significant loadings in three tests:

Test No.  Test name               Loading
 8        Mechanical Information  0.67
 9        Mechanical Principles    .49
28        Two-Hand Coordination    .29

Here  the  use  of  classified  pilots  and  their  restriction  on  mechanical 
variance  is  quite  evident.  This  is  the  well-verified  mechanical-experience 
factor. None of the integration tests have significant loadings in it.

Rotated factor V is defined by the following data:

Test No.  Test name             Loading
[illegible]                     0.71
[illegible]                      .60
 5        Mathematics B          .40
29        Combat Planes          .30
[illegible]                      .27
[illegible]                      .26
23        Following Directions   .26

This  is  the  verbal  factor,  which  has  no  serious  loadings  in  any  inte¬ 
gration  tests  except  Combat  Planes,  and  in  this  test  it  accounts  for  only 
9  percent  of  the  total  variance.  Vocabulary  or  verbal  comprehension  is 
thus  of  trifling  importance  in  these  integration  tests. 

Rotated factor VI is common to but three tests:

Test No.  Test name          Loading
[illegible]                  0.59
[illegible]                   .57
25        Flight Formations   .44

Here  is  decidedly  a  new  factor,  characteristic  of  three  of  the  integra¬ 
tion  tests.  None  of  the  three  is  a  pure  measure  of  it.  Flight  Formations 
comes  nearest  to  being  a  pure  measure,  since  its  secondary  loadings  are 
insignificant.  Its  communality,  however,  is  low,  and  its  variance  in  this 
factor  is  only  21  percent.  If  one  were  to  attempt  to  develop  a  pure  test 
of  this  factor,  either  this  form  would  be  cultivated  or  one  would  attempt 
to  rid  the  other  tests  in  this  list  from  the  intolerable  secondary  variances. 
The  factor,  for  the  present,  may  be  called  integration  I  until  more  is 
known  regarding  it. 

The  chief  thing  that  these  tests  have  in  common  is  the  requirement 
for  the  examinee  to  memorize  and  to  retain  a  number  of  rules  which 
must  be  followed  in  responding  to  the  items.  From  this  consideration  the 
variable  might  be  defined  as  a  memory  factor.  It  is  possible  that  there 
is  a  factor  having  to  do  with  the  retention  of  verbal  instructions  and 
that  it  is  common  to  these  tests  and  the  Memory  for  Tactical  Plans  test 
(see  ch.  11).  No  correlations  are  available  with  which  to  test  this 
suggestion. 

Rotated factor VII is another one prominent in some of the integration
tests:

Test No.  Test name                  Loading
23        Following Directions       0.55
24        Code Analysis               .40
26        Forced Landings             .38
21        Organizational Planning     .35
 4        Reading Comprehension       .25
22        Following Oral Directions   .25

The  distinguishing  feature  of  the  leading  tests  in  this  list  is  appar¬ 
ently  an  ability  to  adapt  quickly  to  new  instructions  and  to  carry  them 
out  successfully.  Almost  every  item  introduces  new  variations  or  modifi¬ 
cations  of  general  instructions  given  at  the  beginning.  There  is  some 
necessity  to  retain  mental  sets,  but  not  for  nearly  so  long  periods  as  in 
the  case  of  factor  VI  just  described.  It  would  be  highly  desirable,  how¬ 
ever,  to  correlate  these  tests  with  memory  tests  in  order  to  determine  the 
possible  identity  of  this  factor  with  some  memory  factor.  Until  further 
information  is  forthcoming,  it  is  best  to  name  this  factor  integration  II. 

Rotated factor VIII is strong in only two tests:

Test No.  Test name              Loading
28        Two-Hand Coordination  0.44
10        Complex Coordination    .45

In  spite  of  the  very  small  number  of  tests  with  which  to  define  this 
factor,  it  is  probably  the  psychomotor  coordination  factor,  held  in  com¬ 
mon  with  the  Rotary  Pursuit  Test  and  Finger  Dexterity  as  shown  in 
other analyses (see ch. 28).

Rotated factor IX is also restricted to two tests:

Test No.  Test name          Loading
30        Log Book Accuracy  [illegible]
31        Marking Accuracy   [illegible]

This  factor  might  be  called  marking  speed,  in  accordance  with  one 
simple,  obvious  aspect  of  these  two  tests.  It  may  be  some  independent 
type  of  clerical  ability,  but  there  is  also  the  possibility  that  it  is  a  much 
more  restricted  ability,  such  as  speed  of  simple  motor  reactions  as  was 
found  in  certain  factor-analysis  studies  preceding  the  war. 

Giving  credence  to  the  last-mentioned  hypothesis,  the  writer  favors 
the  name  psychomotor  speed  for  this  factor. 

Combat  Planes,  the  highly  speeded  integration  test  which  involves 
direct  marking  of  the  answer  sheet,  also  has  a  small  loading  on  this 
factor. It is clear that psychomotor speed does not enter into other integration
tests to any appreciable degree and so does not add to the complexity
of the tests, as had been feared.

Rotated factor X involves a number of integration tests:

Test No.  Test name
26        Forced Landings
 5        Mathematics B
27        Signal Interpretation
16        Spatial Visualization I
15        Figure Analogies
29        Combat Planes
12        Planning a Course
22        Following Oral Directions
 3        Technical Vocabulary (navigation score)

[The loadings for this factor are not legible in the source.]


This is the usual general-reasoning factor. There are two other reasoning
factors, each of a more restricted nature. This one, which has
always been prominent in Mathematics B (arithmetic reasoning), shows
up strongly in several of the integration tests. Forced Landings in particular
appears to be as good a measure of it as Mathematics B. The
items in Forced Landings are, after all, simple arithmetical-reasoning
problems, in which the number work is so simple that the number variance
drops out. As a result of the findings here, we conclude that a number
of the integration tests would be valid for selection of navigators but
would not be aided by reason of general-reasoning variance for the
selection of pilots.

Rotated factor XI is defined by the following data:

[The table of test numbers, test names, and loadings for this factor is not legible in the source.]

This factor list includes a combination of integration and reasoning
tests. It seems to be identical with a factor isolated in two other analyses
of the Nonverbal Reasoning battery (see ch. 7) and the Foresight and
Planning II battery (see ch. 9). The tests, with two possible exceptions
(Figure Analogies and Spatial Visualization I), seem to have in common
the necessity for keeping in mind a number of detailed considerations
provided either from the instructions or from the objects used in the
items. Failure to take into account all considerations leads almost inevitably
to an incorrect response. One hypothesis would be that it is a
span of apprehension or a scope of apprehension. Another might be that
it involves mastery of details. A third hypothesis, somewhat different
from the other two, is that the factor is ideational fluency, the ease with
which the individual can think of new possible responses. This ability
would provide a distinct advantage in most of these tests except Code
Analysis, Planning a Course, and perhaps Signal Interpretation and Combat
Planes. Until further definitive evidence is available, it is best to
name the factor integration III.

Rotated  factor  XII  is  defined  by  a  single  test,  namely,  Map  Distance, 
which  has  a  loading  of  only  0.28.  This  factor  might  have  been  regarded 
as  a  residual,  except  for  the  fact  that  there  is  too  much  spread  in  the 
loadings  and  it  is  possible  to  find  concordant  results  in  other  analyses. 
This  leads  to  the  suggestion  that  it  is  the  length-estimation  factor  in 
which  Map  Distance  has  previously  shown  a  loading  of  0.31.  Spatial 
Orientation  I  is  the  only  other  test  in  the  present  battery  that  has  a 
loading  with  the  factor  approaching  significance.  A  small  amount  of 
length  estimation  in  this  test  could  be  rationalized. 

Rotated  factor  XIII  is  defined  by  the  following  data: 

This is the spatial-relations factor originally called the "intellectual
component of the Complex Coordination test." For classification tests
the loadings here are somewhat lower than the normal levels, due undoubtedly
to the selection of the pilots. The loadings in general might,
therefore, be higher in an unselected sample. However this might be, it
appears that both forms of Instrument Comprehension are better tests of
the factor than is Complex Coordination. The Planning a Course test,
which in some respects was to duplicate the fundamental nature of the
Complex Coordination test on paper, did not measure up to its model with
respect to the measurement of spatial relations. It had a loading of 0.62
in this factor in the foresight-and-planning analysis (see ch. 9), however,
so that this statement must be made with reservations. Its loading
here suffers somewhat from the general restriction in common with other
space tests. It is interesting to note in passing that the Hands test did not
have an appreciable loading in this space factor. Results elsewhere (see
p. 417) will show that it better represents another space factor.

Conclusions

Several  important  deductions  and  implications  can  be  drawn  from  the 
results  of  this  analysis. 

In  the  first  place,  the  chief  feature  of  the  Complex  Coordination  test, 
and  one  of  the  aspects  that  makes  it  valid,  is  not  an  integration  ability, 
as  had  been  hypothesized.  Its  chief  variances  which  contribute  validity 
for  pilot  selection,  and  for  so  many  other  kinds  of  predictions,  are  its 
spatial-relations and psychomotor-coordination factors, and to a small
degree its variances in visualization and perceptual speed.

In spite of the fact that the hypothesis was wrong, it was fruitful in
leading to test development in a new area and to some understanding
of that new area. The three integration factors uncovered by this analysis
become the starting points for new explorations in individual differences.
One factor seems to represent the effective persistence of a
complicated mental set which operates in rapid, complex, clerical-type
work (integration I). A second factor seems to represent an adaptability
of mental set; the trait of being able to modify sets on short notice (integration
II). One might be tempted to call it flexibility of set (absence
of perseveration), except for the fact that flexibility (or perseveration)
tests have notoriously failed to intercorrelate to any substantial degree
(see ch. 20 for an example of this). The third factor may represent
some kind of span or scope of apprehension or attention; the ability to
keep all elements in a set operating effectively (integration III). It will
require further test development to examine all these hypotheses effectively.

As a group, integration tests proved to be nonvisualizing (at least
visualization of the manipulatory type), nonperceptual (in the perceptual-speed
sense), mostly nonnumerical, nonmechanical, and mostly nonverbal.
Neither are they given to variances in the psychomotor-speed
factor. The only better known factors with which many of them are secondarily
involved are general reasoning and spatial relations (to which
reference has already been made). The involvement with general reasoning
often comes about in many a complex task, particularly when
difficulties are encountered and responses are not obvious by way of visualization.

CHAPTER ELEVEN


Memory  Tests1 


MEMORY  IN  AVIATION 
Memory  in  Aviation  Training 

Training,  no  matter  of  what  type,  implies  learning  and  memory.  It  is 
therefore  appropriate  to  expect  that  memory  ability  or  memory  abilities 
should be important in many phases of the education of air-crew trainees.
When the operations of their training are observed and analyzed,
this  conviction  is  even  greater.  Ground-school  training  requires  the  avia¬ 
tion  student  to  absorb  large  quantities  of  factual  material  under  pressure 
in  limited  time.  He  is  expected  to  remember,  and  to  use  later  in  flying, 
the  information  that  is  imparted  to  him  in  his  classroom  work.  In  learn¬ 
ing  to  operate  within  his  specialty  in  the  air,  he  must  also  acquire  a 
great  number  of  skills  that  he  did  not  possess  before.  These  skills  must 
be  stamped  in  with  repeated  drill,  and,  if  possible,  over-learned  to  the 
point  where  he  may  perform  automatically  on  occasion,  resisting  the 
effects  of  distraction  or  of  stress. 

Memory  in  Combat  Operations 

In combat it is expected and hoped that what the air-crew member
learned in the way of factual information and in the way of motor skills
will be sufficiently retained, and reinstated with sufficient facility, for
him to perform the necessary operations for which he spent many months
of preparation. It is also true that training never ceases after flying personnel
have passed beyond the stages designated as training. In other
words, to maintain proficiency and to improve proficiency, the individual
must acquire new information and skills and must perfect skills that he
could not practice completely before.

In addition to the maintenance and improvement of proficiency in his
job, the flying soldier goes through periods of briefing in which he is
expected to note and to remember the important facts concerning the
mission he is about to fly. He must remember his orders and specific features
of the mission that are not carried with him in the form of written
or pictorial material. He must be able to identify features of the landscape,
if necessary, as well as friendly and enemy aircraft that may appear.
Returning from his mission, he should be able to remember and
to relate to interrogation personnel the important details that they wish
to know. From the beginning of training through to the "pay-off" in
combat, successful performance would seem to depend to a very large
degree on the efficiency of the memory of each flying individual.

Job  Analysis  Data 

It  is  not  necessary  to  enumerate  one  by  one  the  many  things  that  must 
be  remembered  during  air-crew  training.  Such  a  listing  of  specific  acts 
of  memory  would  be  superfluous  and  the  role  of  memory  can  be  taken 
for  granted  without  it.  Objective  data  concerning  the  relative  importance 
of  remembering,  however,  should  be  examined  before  a  full  conclusion 
is  reached  regarding  its  place  in  air-crew  performance.  There  are  a 
variety  of  data  that  can  be  cited  in  connection  with  various  phases  of 
training and combat.

Memory  Deficiencies  in  Training 

In the data on 1,000 primary pilot elimination-board cases, it was reported
that 24 percent of the cadets exhibited memory deficiencies sufficiently
serious to be mentioned. Some of the typical notations are as
follows: "Does not retain instruction," "requires repeated demonstrations,"
"repeats mistakes from day to day," "forgets fuel dial after one
look," "fails to switch tanks," "forgets wind direction," "forgets flaps,"
"forgets to look back at tee on take-off," "forgets to notice tee on landing,"
and "neglects reference points on wing." In another sample of
1,303 primary eliminees for whom ratings had been given on the pilot
rating scale of the Air Force Training Command, 39 percent were
checked as having memory deficiencies.

No data are available for basic training, but in advanced single-engine
training 52 out of 100 eliminees were reported as showing memory defects,
and this represented 8 percent of all comments made concerning
them. In advanced twin-engine training, 38 out of 100 had mentions of
memory deficiencies, which represented 7 percent of all comments. In
operational training, of 100 pilots who were reclassified for insufficient
proficiency, 3 percent were reported as having memory defects.

Memory  and  the  Bomber  Crew 

The  relative  importance  of  memory,  as  judged  by  supervisory  officers 
of  combat  personnel  in  the  Eighth  Air  Force,  is  indicated  by  ratings 
made  on  a  scale  of  nine  points.  In  evaluating  the  importance  of  memory 
for  bomber  pilots,  the  average  rating,  as  judged  by  74  observers,  was 
6.4  when  the  range  of  average  ratings  for  other  qualities  was  from 
4.1  to  7.5.  The  average  rating  for  navigators,  as  judged  by  57  observers, 
was  6.9  when  the  means  of  other  traits  ranged  from  5.0  to  8.0.  The 
average  rating  for  bombardiers,  as  judged  by  31  observers,  was  7.0 
with a range for other traits from 5.3 to 8.0. Memory ranked ninth
among 20 psychological requirements for the bomber pilot, being tied
with the trait called "estimation of speed and distance." It was tied for
eighth place along with arithmetic calculations and leadership in the case
of navigators, and for sixth place along with division of attention and
finger dexterity for bombardiers. In other words, its position among
traits in general is regarded as higher than average.
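
The statement that memory stands higher than average among the rated traits can be checked roughly from the figures quoted above. The sketch below locates each average memory rating within the reported range of trait averages (0 for the lowest trait, 1 for the highest). It is only a crude index, since the midpoint of a range need not equal the mean of all traits; the function name is illustrative.

```python
def position_in_range(value, low, high):
    """Fraction of the way from low to high at which value lies."""
    return (value - low) / (high - low)

# (memory rating, lowest trait average, highest trait average), as quoted
# in the text for each air-crew specialty.
ratings = {
    "bomber pilots": (6.4, 4.1, 7.5),
    "navigators": (6.9, 5.0, 8.0),
    "bombardiers": (7.0, 5.3, 8.0),
}
positions = {job: position_in_range(*vals) for job, vals in ratings.items()}
for job, p in positions.items():
    print(f"{job}: {p:.2f}")
```

In all three specialties the memory rating falls above the midpoint of the range, which agrees with the rank positions reported (ninth of 20 for pilots, eighth for navigators, sixth for bombardiers).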

RESEARCH  ON  MEMORY  TESTS 
A  Systematic  Plan 

While a considerable amount of experimental work has been done in
the field of memory, and the techniques of memory research are numerous
and well known, not a great deal is known concerning individual
differences  with  respect  to  memory  performance.  Previous  factor-analy¬ 
sis  studies  have  at  least  demonstrated  the  fact  that  memory  ability  is  not 
a  single  trait  in  which  people  differ,  but  rather  that  there  are  a  number 
of  memory  abilities.  (1)  It  is  not  apparent  when  one  considers  the 
many  learning  activities  and  the  many  situations  calling  for  remember¬ 
ing  in  connection  with  combat  aviation  just  what  specific  memory  fac¬ 
tors  are  most  important.  It  seemed  desirable,  therefore,  to  make  a  rather 
comprehensive  and  searching  survey  of  this  area — within  the  time  per¬ 
mitted  by  the  urgency  of  the  military  situation — in  order  to  be  sure  that 
the  important  memory  variables  would  be  investigated.  This  implies  a 
"shot  gun"  approach,  but  it  was  not  by  any  means  a  completely  blind 
approach.  There  was  not  sufficient  time  to  explore  all  possible  avenues, 
and  there  were  restrictions  in  terms  of  the  type  of  memory  task  that 
could  be  suited  to  the  routine  of  classification  testing.  Within  these  limi¬ 
tations,  a  rather  extensive  plan  of  research  was  evolved. 

The  plan  included  factor  analyses  of  two  batteries  of  memory  tests  in 
order  to  determine  what  fundamental  variables  were  important  and 
which  tests  were  most  saturated  in  them.  Since  the  time  available  for 
validation testing was limited at this period of research, only those tests
with  the  highest  factor  loadings  in  memory  abilities  were  to  be  vali¬ 
dated.  As  it  happened,  the  opportunity  to  validate  most  of  them  was 
later  provided. 

Features  of  Memory  Tasks 

If  one  recalls  the  various  laboratory  techniques,  such  as  memory  span, 
paired  associates,  serial  learning,  and  the  like,  and  if  one  also  considers 
the  various  types  of  materials  and  the  many  methods  for  measuring  the 
amount of retention, recall, and recognition, the lines of test possibilities
tend to become clearer.

Types of material. The favorite types of material are few. Verbal
material is, perhaps, the most common. It may be either meaningful or
nonsensical, and it may be presented in printed or in oral form. Pictorial
material may be either schematic or photographic, and in meaningful or
in relatively meaningless form. The things to be memorized may be presented
in any of these forms in group testing, and the final test of efficiency
of retention, recall, and recognition may also be given in terms
of any of the same types of material. Presentation and final test activity
may be in terms of the same kind of material and the same modality or
type of perceptual or motor performance, or they may differ in these
respects in many combinations. In selecting test ideas for development,
the kinds of combinations most common to air crew were given highest
priority.

Within  the  limitations  imposed  by  group  testing,  it  was  not  possible 
to  set  up  the  conditions  for  assessing  individual  differences  in  the  reten¬ 
tion  and  use  of  motor  skills  and  other  habits.1  The  approach  described 
in  this  chapter  voluntarily  restricts  itself  to  the  type  of  memory  task  that 
can  be  applied  in  group  testing.  It  will  be  found  that  the  tests  that  follow 
depend  frequently  upon  learning  by  paired  associates.  The  material  is 
either pictorial or verbal. The final measurement of memory proficiency
is  in  terms  of  recognition  tests — multiple-choice  and  matching  types  of 
response. This restriction was imposed by the use of the answer sheet.
It  is  upheld  by  the  conviction  that  efficiency  of  recognition  and  of  recall 
are  very  highly  correlated,  at  least  within  the  limits  of  the  retention  in¬ 
tervals  utilized  in  the  tests. 

Immediate  vs.  delayed  recall. — Most  memory  tests  heretofore  have 
required  recall  or  recognition  only  after  relatively  short  time  intervals. 
Practical considerations have usually demanded this type of test. The
memory  involved  in  air-crew  performance,  however,  is  of  the  delayed 
rather  than  the  immediate  type.  It  may  be  that  the  two  are  not  very 
highly  correlated.  Since  classification  testing  of  aviation  students  had  to 
be  completed  within  a  2-day  period,  it  would  have  been  possible  to  in¬ 
sert  an  interval  of  24  hours.  This  would  not  have  been  very  convenient, 
since  all  group  testing  was  confined  to  1  day.  The  only  test  utilizing 
more  than  a  few  seconds  delay  between  observation  and  recognition  test 
was  one  that  involved  an  interval  of  approximately  2  hours.  During  this 
interval,  students  were  occupied  with  other  tests.  Had  the  interval  been 
longer,  even  extending  through  the  noon  mess  period,  there  would  have 
been opportunity for extraneous factors to disturb the reliability of the
recognition test. The intervening of nonstandard activities between the
impression and test of retention and recall has always been a disturbing
feature of long-interval memory tests.


Face Validity

It  was  quite  easy  to  apply  the  principle  of  face  validity  to  memory 
tests.  Pictures  of  planes  and  their  names,  landmarks  as  seen  from  the 
air  paired  with  names,  and  aerial  maps  to  be  remembered  by  name  or  by 
visual  features,  provided  a  wealth  of  material.  A  set  of  orders  for  a 
mission  presented  orally  provided  a  simulated  briefing.  Identification  of 
ships, aircraft, and of landmarks as in pilotage, were represented in several
of the tests. Retention of verbal instructions was represented in
others.  The  pertinence  of  most  tests  should  be  apparent  to  any  observer 
who  is  conversant  with  military-aviation  requirements. 

Classification  of  memory  tests. — The  tests  included  in  this  chapter  are 
classified logically on the basis of (a) type of material, (b) nature of
the task involved, and (c) manner of presentation and response. There
are two main kinds of material: (1) pictorial and (2) symbolic. Each of
these main categories is subdivided into (1) tests which require memory
for complex wholes and relations of parts and (2) those which require
memory for simple wholes (paired associates). Each of these divisions
may be further broken down according to manner of presentation
of and response to the items.

PICTORIAL  MEMORY  TESTS 

Placed  under  this  category  are  those  tests  that  involve  the  ability  to 
remember  and  to  recognize  material  of  a  nonverbal,  pictorial  nature,  in¬ 
cluding  both  complex  wholes  and  relations  of  parts,  and  simple  wholes 
(paired  associates).  Tests  involving  memory  for  complex  wholes  and 
relations  include  those  presenting  a  pictorial  stimulus  and  pictorial  re¬ 
sponse,  and  those  presenting  a  pictorial  stimulus  with  a  verbal  response  or 
question.  Tests  involving  memory  for  simple  wholes  include  those  pre¬ 
senting  a  pictorial-verbal  stimulus  and  a  pictorial- verbal  response,  where 
the  memory  is  primarily  for  the  pictorial  element. 

Rationale 

As  pointed  out  earlier  in  this  chapter,  air-crew  personnel  must  carry 
with them, mentally, certain information necessary for the success of the
mission. Much of this material is nonverbal, nonsymbolic, or pictorial.
Some  of  this  pictorial  material  consists  of  recently  acquired  and  com¬ 
plex  information,  such  as  maps  of  enemy  territory  with  landmarks,  tar¬ 
gets,  and  other  identifying  features.  Orientation  to  the  terrain  and  ob¬ 
jectives  of  the  mission  requires  the  recognition  of  identifying  features 
as  previously  studied  in  maps  of  the  territory.  In  addition,  after  the 
mission  is  accomplished,  memory  for  the  events  of  the  trip  in  terms  of 
the  territory  flown  over  is  important  in  the  accurate  evaluation  of  re¬ 
sults.  The  measurement  of  this  type  of  memory  is  attempted  in  the 
Map Memory tests which follow and the Memory for Landmarks test,
which  requires  the  remembering  and  recognition  of  single  identifying 
features. 

Memory  for  complex  wholes  and  relations  is  also  important  in  the 
routine,  mechanical  performance  of  air-crew  duties  under  conditions 
where  attention  is  diverted  by  other  necessary  activities  or  distractions. 
Thus,  the  pilot  must  know  his  instruments  and  their  relative  positions 
by memory, so that he can operate them without seeing them, when necessary.
This type of memory is represented by the Memory for Instrument
Board test described below.

Another  type  of  pictorial  memory  is  necessary  for  the  quick  recogni¬ 
tion  and  differentiation  of  enemy  and  friendly  aircraft  and  ships.  Al¬ 
though  to  a  certain  extent  proficiency  in  this  area  requires  constant 
learning as new types of aircraft and ships are put into combat service,
much of it represents the memory for outlines, forms, and identifying
characters learned early in training. This area is sampled by the Memory
for Planes and Memory for Ships tests described in this chapter.


FIGURE 11.1

STUDY MAP AND TEST ITEMS OF MAP MEMORY, CI505AX1

Map Memory, CI505AX1 *

This is a pictorial-memory test in which the stimulus is pictorial and
the response is in verbal form. It was designed to measure the ability to
remember  complex  wholes  and  relations  of  parts. 

Description. (1) Internal characteristics. — The test consists of 2 parts,
each in a separate booklet containing 60 items. Part I contains a sample
map with 6 practice items and 3 test maps, each followed by 20 items
concerning that map. Part II consists of 3 additional maps of the same
type followed by 20 items each. Each item has five alternative verbal
responses, which distinguishes this test from the visual form described later.
Sample questions and a section of the appropriate map are shown in
figure 11.1.

(2)  Administration. — Instructions  for  the  test  inform  the  examinee 
that  the  test  is  a  measure  of  his  ability  to  remember  details  of  a  map 
which  he  is  to  study  for  a  brief  period  of  time.  He  is  directed  to  note 
particularly,  in  his  study  of  the  large  map,  features  which  will  enable 
him  to  remember  any  section,  such  as : 

1.  Names  of  places  and  things. 

2.  Locations  of  places  and  things  in  relation  to  each  other. 

3. Compass directions, e.g., location of one part of the map as north
of another part.

4. Important routes by road, rail, air, etc., from one part of the map
to another.

5. Number of times certain important objects occur in the map.

The total time limit is 45 minutes for part I and 41 minutes for part II. Two
minutes are allowed for study of the sample map in part I, followed by
2 minutes for answering the six sample items. Each of the test maps in
part I is studied for 4 minutes, and 8 minutes are allowed for answering
each set of 20 items. In part II, 5 minutes are allowed for the study of
each map and 7 minutes for answering each group of 20 items. The
administration of part I and part II of the test is separated by the
administration of other tests in the battery in order to decrease the effect of
proactive inhibition or other interferences.

(3) Scoring. — The scoring formula is R - W/4.
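
The scoring formulas quoted for the tests in this chapter (R - W/4, R - W/3, R - W) all fit the usual correction for guessing, where R is the number of right answers, W the number of wrong answers, and the divisor is one less than the number of alternatives per item. The following is a minimal Python sketch under that reading. The function name and the example counts are illustrative assumptions and do not come from the report.

```python
def formula_score(rights: int, wrongs: int, n_choices: int) -> float:
    """Rights minus a fraction of wrongs: R - W / (n_choices - 1).

    Omitted items count neither as right nor as wrong.
    n_choices is the number of alternatives offered per item.
    """
    if n_choices < 2:
        raise ValueError("an item needs at least two alternatives")
    return rights - wrongs / (n_choices - 1)


# Hypothetical examinee on a 120-item, five-choice test:
# 70 right, 40 wrong, 10 omitted.
print(formula_score(70, 40, 5))  # 60.0
```

With five alternatives the penalty is W/4, with four it is W/3, and with two (right or wrong, yes or no) it reduces to R - W.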

Statistical results. — The data given below are for a group of 179
classified  bombardiers  and  259  unclassified  aviation  students  tested  at 
Psychological  Research  Unit  No.  3  in  September  1942. 

(1)  Distribution  statistics.—  For  358  cases  (bombardiers  and  unclas¬ 
sified  students),  the  mean  raw  score  (for  parts  I  and  II,  120  items)  was 
64.8,  with  a  standard  deviation  of  18.2  and  a  range  from  18  to  105. 

(2) Internal consistency. — For the bombardier sample, the phi coefficients
ranged from -0.10 to 0.55, with a mean of 0.33 and a standard
deviation of 0.11 for part I. For part II, the range was from 0.01 to
0.57,  with  a  mean  of  0.30  and  a  standard  deviation  of  0.11.  For  an  N 
of  179  (upper  and  lower  halves  of  approximately  90  were  employed)  a 
phi  of  approximately  0.14  is  required  for  significance  at  the  5  percent 
level,  and  a  phi  of  0.20  for  significance  at  the  1  percent  level.  Although, 
as  a  whole,  the  test  showed  fair  consistency,  many  items  were  lacking  in 
discrimination  between  the  upper  and  lower  criterion  groups. 
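
The significance levels quoted for phi can be approximated by treating N times phi squared as chi-square with one degree of freedom, so that the critical phi is the two-tailed normal deviate divided by the square root of N. The sketch below assumes that approach, which the report does not state explicitly; `critical_phi` is a hypothetical helper name. For N = 179 it gives roughly 0.15 and 0.19, close to the approximate values quoted above.

```python
from math import sqrt
from statistics import NormalDist


def critical_phi(n: int, alpha: float) -> float:
    """Smallest |phi| significant at two-tailed level alpha for n cases.

    Assumes chi-square = n * phi**2 with 1 degree of freedom,
    which is equivalent to |phi| >= z / sqrt(n).
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z / sqrt(n)


for n in (179, 200, 240, 671):
    print(n, round(critical_phi(n, 0.05), 2), round(critical_phi(n, 0.01), 2))
```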

(3)  Reliability  coefficients. — The  correlation  of  part  I  with  part  II 
was  0.71,  which  yields  a  coefficient  of  0.84  when  corrected  for  double 
length.  The  group  of  179  bombardiers  was  used  for  the  computation. 
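
The correction "for double length" is presumably the Spearman-Brown formula; the sketch below assumes so. The function name is illustrative. With the rounded part correlation of 0.71 it gives about 0.83, slightly below the 0.84 quoted, a gap that may simply reflect rounding of the part correlation.

```python
def spearman_brown(r_part: float, length_factor: float = 2.0) -> float:
    """Estimated reliability of a test lengthened by length_factor,
    given the correlation r_part between comparable parts."""
    return length_factor * r_part / (1 + (length_factor - 1) * r_part)


print(round(spearman_brown(0.71), 3))  # 0.83
```

The same function covers a correction for triple length by passing `length_factor=3`.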

(4)  Difficulty. — Difficulty  indices  (N=179)  ranged  from  0.19  to 
0.96,  with  a  mean  of  0.58,  corrected  for  chance,  and  a  standard  deviation 
of  0.17  for  part  I;  and  from  0.21  to  0.93,  with  a  mean  of  0.61,  cor¬ 
rected  for  chance,  and  a  standard  deviation  of  0.16  for  part  II.  Diffi¬ 
culty  is  therefore  satisfactory,  although  some  of  the  items  are  perhaps 
too  difficult  and  others  somewhat  too  easy. 
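
The report does not say how the difficulty indices were corrected for chance. One common correction, shown here purely as an assumption, rescales the observed proportion passing so that chance-level success on a k-choice item maps to zero. The function name and the example proportion are hypothetical.

```python
def corrected_difficulty(p_observed: float, n_choices: int) -> float:
    """Proportion passing an item after removing expected chance success.

    Assumes blind guessing among n_choices equally attractive alternatives.
    """
    chance = 1 / n_choices
    return (p_observed - chance) / (1 - chance)


# Hypothetical five-choice item passed by 68 percent of examinees.
print(round(corrected_difficulty(0.68, 5), 2))  # 0.6
```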

(5) Factorial composition. — The most significant loadings are in the
visual-memory  (0.54),  verbal  (0.42),  perceptual-speed  (0.35),  and  vis¬ 
ualization  (0.26)  factors.  The  communality  is  0.70.  For  a  fuller  picture 
of  the  factorial  composition  of  this  test,  see  appendix  B. 

(6) Test validity. — A sample of 212 pilots yielded a biserial correlation
of -0.16 between performance in this test and graduation-elimination
in pilot training. The mean score for the graduates was 29.00, for
eliminees 32.05, and the standard deviation for both combined was
11.04. Of this sample 75 percent were graduates.
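
The biserial coefficient can be recovered from the means, standard deviation, and proportion of graduates just given. The sketch uses the textbook formula r = (Mp - Mq)/SD x pq/y, where y is the ordinate of the unit normal curve at the point dividing the proportions p and q. That the authors used exactly this computing formula is an assumption, and the function name is illustrative.

```python
from statistics import NormalDist


def biserial_r(mean_pass: float, mean_fail: float,
               sd_total: float, p_pass: float) -> float:
    """Biserial correlation between a score and a pass/fail criterion."""
    q_fail = 1 - p_pass
    z_cut = NormalDist().inv_cdf(p_pass)   # point dividing p from q
    ordinate = NormalDist().pdf(z_cut)     # height of the normal curve there
    return (mean_pass - mean_fail) / sd_total * (p_pass * q_fail / ordinate)


# Values reported for the sample of 212 pilots.
print(round(biserial_r(29.00, 32.05, 11.04, 0.75), 2))  # -0.16
```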

Evaluation. — Factor  analysis  of  this  test  shows  that  70  percent  of 
total  variance  is  accounted  for  by  common  factors.  The  visual-memory 
factor  accounts  for  29  percent,  the  verbal  factor  for  18  percent,  the  per¬ 
ceptual-speed  factor  for  12  percent,  and  the  visualization  factor  for  7 
percent.  The  remaining  4  percent  of  the  total  variance  is  accounted  for 
by  common  factors  on  which  the  loadings  are  quite  low.  Since  the  reli¬ 
ability  is  0.84,  the  test  contains  some  unknown  common-factor  variance. 
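
The percentages in this evaluation follow from squaring the factor loadings: each squared loading is the share of total variance attributed to that factor, the communality less their sum is the share left to minor common factors, and the reliability less the communality is reliable variance not yet explained. A short sketch using the loadings reported above; the function and key names are illustrative.

```python
def variance_breakdown(loadings: dict, communality: float,
                       reliability: float) -> dict:
    """Percent of total variance per factor, from orthogonal factor loadings."""
    shares = {name: loading ** 2 for name, loading in loadings.items()}
    shares["other common factors"] = communality - sum(shares.values())
    shares["unexplained reliable variance"] = reliability - communality
    return {name: round(100 * share) for name, share in shares.items()}


ax1 = variance_breakdown(
    {"visual memory": 0.54, "verbal": 0.42,
     "perceptual speed": 0.35, "visualization": 0.26},
    communality=0.70,
    reliability=0.84,
)
print(ax1)
```

For these figures the result is 29, 18, 12, and 7 percent for the four named factors, 4 percent for the minor common factors, and 14 percent of reliable variance unaccounted for.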

The obtained validity of -0.16 is not congruent with that obtained
for a similar form, CI505BX1 (see discussion immediately following).
For the sample on which this validity was obtained, the standard error
of the biserial correlation is 0.09. The -0.16, therefore, is not significantly
different from zero.
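
The standard error of 0.09 and the conclusion drawn from it can be checked with the classical large-sample formula for the standard error of a biserial coefficient, (sqrt(pq)/y - r^2)/sqrt(N). The sketch assumes that formula and the sample values reported above (N = 212, 75 percent graduates); the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def biserial_se(r: float, p_pass: float, n: int) -> float:
    """Large-sample standard error of a biserial correlation."""
    q_fail = 1 - p_pass
    ordinate = NormalDist().pdf(NormalDist().inv_cdf(p_pass))
    return (sqrt(p_pass * q_fail) / ordinate - r ** 2) / sqrt(n)


se = biserial_se(-0.16, 0.75, 212)
ratio = abs(-0.16) / se
print(round(se, 2))   # 0.09
print(ratio < 1.96)   # True: short of the 5 percent level
```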

Since  the  time  required  for  the  entire  test  (85  minutes)  is  imprac¬ 
ticable  for  a  single  test,  it  is  considered  unnecessary  to  retain  two  parts. 
The  test  was  considered  worth  revising  because  of  the  fact  that  certain 
items  were  highly  related  to  total  score.  It  was,  therefore,  purified  by 
eliminating  items  with  negative  and  low  phis,  as  determined  in  the  sam¬ 
ple  of  179  bombardiers.  The  resulting  test  is  called  Map  Memory, 
CI505BX1,  which  is  described  next. 

Map Memory, CI505BX1

This test is the first revision of Map Memory, CI505AX1. The only
difference  between  the  two  tests  is  in  length;  length  was  reduced  by 
eliminating  items  that  did  not  discriminate  between  examinees  with  high 
and  low  total  scores  on  the  original  form. 

Description.  (1)  Internal  characteristics. — The  test  consists  of  three 
large  schematic  maps,  the  first  being  a  sample  map.  The  sample  map  is 
followed  by  5  sample  items,  and  each  of  the  two  test  maps  is  followed 
by  20  items.  A  sample  map  and  item  will  be  found  in  connection  with 
the  description  of  the  original  form  of  the  test. 

(2)  Administration. — Thirty-five  minutes  are  allowed  for  the  entire 
test.  The  sample  map  is  studied  for  2  minutes,  followed  by  2  minutes  for 
answering  the  five  sample  items.  Each  test  map  is  studied  for  4  minutes, 
and  8  minutes  are  allowed  for  answering  each  set  of  20  items. 

(3) Scoring. — The scoring formula is R - W/4.

Statistical  results.  (1)  Distribution  statistics. — The  test  was  adminis¬ 
tered  at  Psychological  Research  Unit  No.  3  in  September  and  November 
1942,  to  2,148  aviation  students  for  validation.  Distribution  constants 
for  each  set  of  20  questions  and  for  the  total  test  are  presented  in 
table  11.1. 


Table 11.1. — Distribution constants for Map Memory, CI505BX1, based upon a
sample of classified pilots (N = 793)*


(2)  Internal  consistency. — Thirty  of  the  forty  items  are  identical 
with  those  of  part  I  of  CI505AX1.  The  phis  for  these  30  items  range 
from  0.24  to  0.51,  with  a  mean  of  0.38. 

(3) Reliability coefficients. — Map I items were correlated with map
II items, yielding a coefficient of 0.66, corrected for length, for an N of
500 unclassified aviation students.

(4) Difficulty. — Difficulty values for the 30 items taken from
CI505AX1 range from 0.37 to 0.92, with a mean of 0.61, corrected for
chance.

(5)  Factorial  composition. — The  most  significant  loadings  are  in  the 
visual-memory  (0.52),  paired-associates  memory  (0.41),  general-rea¬ 
soning  (0.23),  and  perceptual-speed  (0.22)  factors.  The  communality  is 
0.56.  For  a  full  picture  of  the  factorial  composition  of  this  test,  see 
appendix  B. 

(6) Test validity. — The test was validated against graduation-elimination
in primary pilot school, map I and map II being validated separately,
as well as the total test. Validation results are presented in
table 11.2.

Table 11.2. — Validity data for Map Memory, CI505BX1, based on the criterion of
graduation-elimination from elementary training*

Part                            SD       r
Map I      10.92      9.03      5.1*     .10
Map II     14.72     12.02      2.6*     .22
Total      25.42     22.60      7.5*     .17

* N = 793; p(=.M; classes 44B and 44C.


Evaluation. — By reason of the method of construction, i.e., selecting
the  best  items  from  CI505AX1,  the  internal  consistency  of  this  test  is 
relatively  high.  Its  reliability,  however,  is  only  fair,  perhaps  because  of 
its  short  length  and/or  its  level  of  difficulty.  Its  total  validity  is  not 
high,  but  it  has  much  unique  valid  variance  to  offer,  since  both  memory 
factors  have  some  small  validity  for  the  pilot. 

Factor analysis of this test shows that 56 percent of the total variance
is accounted for by common factors, leaving a small amount of unknown
variance. Of the known variance, 5 percent is accounted for by the perceptual-speed
factor, 5 percent by the general-reasoning factor, 27 percent
by the visual-memory factor, and 17 percent by the paired-associates
memory factor. The remaining 2 percent is accounted for by factors
on which the loadings are quite low. It is not a pure test, but its two
leading factors both seem to be weighted in the pilot criterion.

An  estimated  validity  coefficient  (computed  from  factor  validities; 
see  table  28.17)  is  lower  than  that  found  empirically.  This  indicates  that 
there  is  common-factor  variance,  valid  for  pilot  training,  that  was  un¬ 
accounted  for  in  the  analysis.  This  is  probably  visualization,  which  did 
not  emerge  in  the  battery  in  which  the  BX1  form  was  analyzed,  but  did 
emerge  in  the  other  memory  battery,  in  which  the  AX1  form  appeared. 

Map Memory, CI505AX3 *

This  test  is  similar  to  the  two  preceding  tests  and  was  designed  to 
measure  the  same  functions.  It  differs  from  Map  Memory,  CI505AX1 
and  CI505BX1,  in  that  the  period  of  study  of  the  maps  is  reduced  and 
fewer  questions  are  asked  on  each  map. 

Description.  (1)  Internal  characteristics. — The  test  consists  of  one 
sample  map  and  six  test  maps  of  the  same  type  used  in  the  preceding 
tests.  Four  items  follow  the  sample  map,  and  10  items  follow  each  of 
the  6  test  maps. 

(2)  Administration. — The  total  time  for  the  test  is  35  minutes.  Thirty 
seconds  are  given  for  the  study  of  the  maps,  including  the  sample  map; 
2  minutes  are  allowed  for  answering  the  4  sample  questions,  4  minutes 
for  the  first  10  items  and  3  minutes  for  each  of  the  5  remaining  sets  of 
10  items. 

(3) Scoring. — The scoring formula is R - W/4.

Statistical  results. — The  data  given  below  are  for  examinees  tested 
at  Psychological  Research  Unit  No.  3  in  November  1942. 

(1)  Distribution  statistics. — A  sample  of  240  unclassified  aviation 
students  and  179  bombardiers  yielded  a  mean  score  of  23.2  and  a  stand¬ 
ard  deviation  of  8.6.  The  distribution  curves  were  approximately  normal. 

(2) Internal consistency. — Phi coefficients ranged from 0.09 to 0.50,
with  a  mean  of  0.30  and  a  standard  deviation  of  0.10.  Upper  and  lower 
groups  of  100  each  (highest  27  percent  and  lowest  27  percent  of  the 
total)  were  used.  For  an  N  of  200,  phis  of  0.14  and  0.18  are  required 
for  significance  at  the  5  percent  and  1  percent  levels  respectively. 
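
An internal-consistency phi of the kind reported here can be computed for a single item from the numbers passing it in the upper and lower total-score groups. The sketch below uses the standard fourfold-table formula; the counts in the example are hypothetical and the function name is illustrative.

```python
from math import sqrt


def item_phi(upper_pass: int, upper_n: int,
             lower_pass: int, lower_n: int) -> float:
    """Phi coefficient between passing an item and upper/lower group membership."""
    a, b = upper_pass, upper_n - upper_pass   # upper group: pass, fail
    c, d = lower_pass, lower_n - lower_pass   # lower group: pass, fail
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        return 0.0  # item passed or failed by everyone carries no information
    return (a * d - b * c) / denominator


# Hypothetical item: 80 of the upper 100 and 50 of the lower 100 pass it.
print(round(item_phi(80, 100, 50, 100), 2))  # 0.31
```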

(3) Reliability coefficients. — For a group of 239 students the sum
of  the  scores  in  the  first,  third,  and  fifth  groups  of  10  items  was  corre¬ 
lated  with  the  sum  of  the  scores  in  the  second,  fourth,  and  sixth  groups 
of  10  items,  yielding  a  coefficient  of  0.67  corrected  for  length. 

(4)  Difficulty. — Difficulty  indices  were  computed  for  the  same  sam¬ 
ple  used  to  determine  internal  consistency.  They  ranged  from  0.21  to 
0.92,  with  a  mean  of  0.54,  corrected  for  chance  success,  and  a  standard 
deviation  of  0.16. 

(5)  Factorial  composition. — The  most  significant  loadings  are  in  the 
visual-memory  (0.55),  verbal  (0.31),  visualization  (0.31)  and  spatial- 
relations  (0.21)  factors.  The  communality  is  0.61.  For  a  fuller  picture 
of the factorial composition of this test, see appendix B.

(6) Test validity. — Validation data are shown in table 11.3.


Table 11.3. — Validity data for Map Memory, CI505AX3, for a sample of pilots
in primary training, graduation-elimination criterion (N = 176, pt—M)

Part                          SD
I        11.71     10.44
II       12.72     12.10

Evaluation. — This test has fair internal consistency, and its item difficulties
are on the whole satisfactory. Its reliability is somewhat low, being
the same as that of CI505BX1, which had fewer items although
requiring the same amount of time. The pilot validity of the test is low,
but it makes a unique contribution.

Factor  analysis  of  this  test  shows  that  common  factors  account  for 
61  percent  of  the  total  variance,  leaving  only  6  percent  of  the  nonerror 
variance  unknown.  Of  this,  the  verbal  and  visualization  factors  account 
for 10 percent each, the spatial-relations factor for 4 percent, and the
visual-memory  factor  for  30  percent.  The  remaining  7  percent  of  the 
total  variance  accounted  for  is  found  in  factors  on  which  the  loadings 
are  quite  low. 

Estimation of the pilot validity of CI505AX3, by means of factor
equations  (see  table  28.18),  yields  a  coefficient  similar  to  that  found 
empirically.  This  indicates  that  all  the  valid  factors  of  this  test  have 
been  accounted  for  by  the  analysis. 

Map Memory (Visual Form), CI505AX2 *

This  test  involves  pictorial  memory  for  complex  wholes  and  for  rela¬ 
tions  of  parts  with  both  stimulus  material  and  response  material  being 
pictorial.  It  was  designed  to  measure  visual  memory  for  map  details.  The 
utilization of the recognition of pictorial response material rather than
verbal questions makes the task more like that actually required of air-crew
members in flight.

Description.  (1)  Internal  characteristics. — The  test  consists  of  3  large 
diagrammatic  maps,  each  followed  by  20  items,  each  item  being  in  the 
form  of  5  small  map  sections,  one  of  which  is  an  accurate  reproduction 
of  a  section  of  the  large  map.  A  sample  problem,  consisting  of  a  large 
map followed by an item, is shown in figure 11.2.

(2)  Administration. — The  instructions  direct  the  examinees,  when 
studying  the  large  map,  to  note  particularly  features  which  will  enable 
them  to  identify  any  section,  such  as: 

1.  Names  of  places  and  things. 

2.  Locations  of  places  and  things  in  relation  to  each  other. 

3.  Number  of  times  important  objects  appear  in  a  given  area. 

4.  Courses  followed  by  roads,  coastlines,  boundary  lines,  etc. 

The  total  time  limit  for  the  test  is  60  minutes.  Three  minutes  are  al¬ 
lowed  for  study  of  a  sample  map,  followed  by  3  minutes  for  answering 
the  three  sample  items.  Five  minutes  are  allowed  for  study  of  each  of  the 
3  test  maps,  followed  by  12  minutes  for  each  set  of  20  items. 

(3) Scoring. — The scoring formula is R - W/4.

Statistical results. — Data are available for pilots in class 44D tested
at Psychological Research Unit No. 3 in November 1942, and for samples
of navigators tested at Selman Field on May 31 and June 1, 1944, at Ellington
Field on May 22 and 23, 1944, and at Psychological Research
Unit No. 3 on May 4, 5, and 6, 1944.

(1) Distribution statistics. — The administration of the test to 689
classified pilots (tested while unclassified) yielded a mean of 29.2 and
a standard deviation of 11.8.

(2)  Internal  consistency. — The  phi  coefficients  ranged  from  0.11  to 
0.65,  with  a  mean  of  0.42  and  a  standard  deviation  of  0.10.  For  an  N  of 
240  (highest  and  lowest  fourths  of  120  pilots  each  were  employed)  a 
phi of 0.13 is required for significance at the 5 percent level and one of
0.17 for significance at the 1 percent level. The results indicate satisfactory
internal consistency for the test.

(3) Reliability coefficients. — The three groups of 20 items each were
treated as separate parts and intercorrelated. The following reliability
coefficients (corrected for triple length) were obtained (N = 487 pilots):
part I vs. part II, 0.83; part I vs. part III, 0.79; part II vs. part III,
0.81.

(4)  Difficulty. — Difficulty  indices,  computed  for  the  same  sample, 
ranged  from  0.29  to  0.85,  with  a  mean  of  0.59,  corrected  for  chance  suc¬ 
cess,  and  a  standard  deviation  of  0.13,  indicating  a  satisfactory  difficulty 
level  for  the  sample  studied  (N=487). 

(5) Factorial composition. — The most significant loadings are in the
visual-memory (0.58), the perceptual-speed (0.35), and the verbal (0.23)
factors. The communality is 0.59. For a fuller picture of the factorial
composition of this test, see appendix B.

(6) Test validity. — Validation data are shown in table 11.4.


Table 11.4. — Validity data for Map Memory, CI505AX2, based on the graduation-
elimination criterion

Group                          Scoring   p     Mean      Mean      SD      r       r
                                               (grad.)   (elim.)                   (corrected)
Pilots in primary training     R - W/4   0.91  29.15     27.50     11.83   0.08    0.11
Pilots through basic training  R - W/4    .86  28.90     28.14     11.66    .04     .10
Navigation students            Rights     .91  39.31     3J.J7      9.75    .20     .34
Navigation students            Wrongs     .91  15.48     18.55      8.45   -.18    -.31
Navigation students            R - W/4    .91  55.69     30.94     11.35    .20     .34

Assuming an unrestricted stanine standard deviation of 1.13.
Assuming an unrestricted stanine standard deviation of 2.00.
For this sample, the correlation between rights and wrongs is -.71.


(7) Item validity. — Based upon 671 pilots, 600 of whom graduated
from primary training, item-validity phi coefficients ranged from -0.15
to 0.20 with a mean of 0.02 and a standard deviation of 0.04.
For an N of 671, phis of approximately 0.08 and 0.10 are required for
significance at the 5 percent and 1 percent levels respectively.

Evaluation. — The estimated reliability of this test is satisfactory, and
its correlations with other tests and with the pilot stanine are relatively
low. It would contribute a very small unique pilot validity by virtue of
its loading with the visual-memory factor, but at too great a cost in
testing time. The average pilot validity is 0.17 (including data from one
sample of 92 not mentioned before), which is almost all accounted for
by known factors. The navigator validity coefficient is fairly high, which
suggests that further exploration of memory tests for this air-crew position,
particularly those saturated with the visual-memory factor, would
be worth while.

Factor analysis of this test has accounted for 59 percent of its total
variance (compared with a reliability of about 0.80). Of this the verbal
factor accounts for 5 percent, the perceptual-speed factor for 12 percent,
and the visual-memory factor for 34 percent of the total variance.
The  remaining  8  percent  is  accounted  for  by  factors  on  which  the  load¬ 
ings  are  quite  low.  Map  Memory,  CI505AX2,  is  the  purest  test  in  this 
series,  measuring  the  visual-memory  factor  fairly  well.  As  such,  it  has 
value  in  factor-analysis  research. 

Visual Memory, CI514A *

This is a nonverbal memory test. It was designed to measure visual
memory for parts of a complex whole. It was believed that the test would
stress visual-memory ability to a greater degree than most forms of Map
Memory, CI505. As in Map Memory, CI505AX2, response is made to
a pictorial stimulus rather than to a verbal stimulus.

* Developed at Psychological Research Unit No. 1. Chief contributors: S/Sgt. Arthur L
C«rf, Sgt. il|u* HcBer, Me. Charles W. NtUon.

FIGURE 11.3

SAMPLE PHOTOGRAPH & FOUR TEST ITEMS OF VISUAL
MEMORY, CI514A


Description. (1) Internal characteristics. — The test consists of five
page-size aerial photographs (study-photographs), each with 24 small
test-photographs. The examinee studies the large photograph, turns the
page, then indicates which of the 24 small photographs are portions of
the large one and which are not. A sample problem, consisting of a
study-photograph and four test-photographs, is shown in figure 11.3.

(2) Administration. — The examinee is informed that the test is a
measure of his ability to remember aerial photographs. One minute is
allowed for studying each large photograph and 2 minutes for answering the
24 items. Examinees are told to follow their "hunches" in answering the
items.

(3) Scoring. — The scoring formula is R - W + 20.

Statistical results. — Data are available for unclassified aviation stu¬
dents  tested  in  April  1943  at  Psychological  Research  Unit  No.  3. 

(1)  Distribution  statistics. — A  sample  of  298  unclassified  aviation 
students  yielded  a  mean  score  of  64.2  and  a  standard  deviation  of  14.8. 
The  distribution  curve  is  approximately  symmetrical  and  somewhat  more 
peaked  than  normal. 

(2)  Internal  consistency. — The  degree  of  homogeneity  of  the  items  is 
indicated  by  a  mean  internal-consistency  phi  of  0.44,  a  standard  deviation 
of  the  phi  distribution  of  0.28,  and  a  range  of  values  from  0.00  to  0.98. 
These  statistics  are  based  upon  analysis  of  the  responses  of  the  highest 
27  percent  and  the  lowest  27  percent  in  total  score  of  a  group  of  750 
unclassified  aviation  students. 

(3)  Reliability  coefficient. — Preliminary  evidence  on  reliability  was  ob¬ 
tained  by  the  Kuder-Richardson  method.  An  estimated  reliability  coeffi¬ 
cient  of  0.87  was  obtained.  This  figure  is  based  on  a  sample  of  624  un¬ 
classified  aviation  students. 
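
The report does not say which Kuder-Richardson formula was applied. As one possibility, the sketch below computes formula 20 from a matrix of item scores coded 1 for right and 0 for wrong. The tiny data set is invented for illustration, and the function name is an assumption.

```python
from statistics import pvariance


def kr20(responses: list) -> float:
    """Kuder-Richardson formula 20 for rows of 0/1 item scores."""
    n_people = len(responses)
    n_items = len(responses[0])
    total_variance = pvariance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary")
    # Sum over items of p * q, the variance of each dichotomous item.
    sum_pq = 0.0
    for item in range(n_items):
        p = sum(row[item] for row in responses) / n_people
        sum_pq += p * (1 - p)
    return n_items / (n_items - 1) * (1 - sum_pq / total_variance)


# Invented scores for four examinees on three items.
scores = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(scores))  # 0.75
```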

Evaluation. — The similarity between Visual Memory and the Map
Memory forms, both in subject matter and presentation, leads to the belief
that the test probably has approximately the same validity as those
forms.  This  test  is  well  constructed,  and  it  would  seem  probable  that  it 
will  prove  a  satisfactory  instrument  for  measuring  a  type  of  visual 
memory. 

Plane Position Memory, CI512A *

This test is also a nonverbal, visual-memory test. The test was designed
to measure ability to remember parts of a complex whole, stressing
memory for positions of objects.

Description. (1) Internal characteristics. — On each of four study-pages
of the test are presented nine airplanes in three rows of three. The
airplanes are of different types and are headed in one of four different
directions (up, down, left, right), but all are shown from a side view.
Following the study page, the nine airplanes are shown in different positions
on another page and all are headed toward the left. Figure 11.4
shows a part of a study-page (in the upper panel) and the succeeding
response-page (in the lower panel). There is one practice problem at the
beginning of the test.

* V*.l NV >. Chief contributor: Albert A. Canfield, Jr.

(2) Administration. — The examinees are informed that the test is a
measure of their ability to remember the positions of planes, and that the
task is to remember in what row the airplanes appear and in what direction
they are headed. The examinees are given 2 minutes to study the
planes; then, at a signal, they turn to the response-page. They
are allowed 3 minutes to indicate, by marking A, B, or C, in
what row each airplane appeared, and, by marking A, B, C, or D in the
next-numbered space on the answer sheet, in which direction it was
going. A box at the top of each of the response pages, indicating by
means of arrows the symbol for each direction, helps the examinee
in marking answers. The total time for the test is 30 minutes with an
actual testing time of 20 minutes.

FIGURE 11.4

PLANE POSITION MEMORY, CI512A

(3) Scoring. — The scoring formula is R - W/3.

Statistical  results. — No  statistical  data  are  yet  available  for  this  test. 

Evaluation. — This  test  was  designed  to  measure  visual  memory  as 
purely  as  possible.  Subjective  examination  of  the  test  indicates  that  it 
should  be  fairly  good  for  this  function.  It  is  a  well-designed  and  exe¬ 
cuted  test  that,  for  use  on  an  aviation-student  population,  has  face 
validity. 

Airplane Formation Memory, CI513A *

This is another nonverbal, visual-memory test, developed for analytical
purposes. The test differs from Visual Memory, CI514A, not only in
subject content, but in that it measures ability to remember complex
wholes.

FIGURE 11.5

STUDY PAGE AND SUCCEEDING RESPONSE PAGE OF AIRPLANE
FORMATION MEMORY, CI513A

* Developed at Psychological Research Unit No. 3. Chief contributors: Cul. LUnd D. Brokaw,
^Vcncc k. CiroMman, L% John L Lacey.

Description. (1) Internal characteristics. — The test consists of two
parts of 20 items each, with two sample items. The stimuli are diagrammatic
planes in formation. The shape of the formation and the number
of airplanes in it (4 to 10) vary from item to item. The shape of the
planes and the view (top or side view) are constant within each item,
though they vary among items. The response is made to another formation
of planes similar to the stimulus-formation except that some planes
have been moved out of position within it. The task of the examinee is
to select those planes that have been moved out of position. A sample
item, consisting of a stimulus-formation and the formation from which
the response is made, is given in figure 11.5.

The examinees are given a brief time to study the formation (8 seconds)
and then are told to turn to a given page in the back of the test booklet
where the response-formation appears. The response items are scattered
randomly throughout the last half of the booklet. This was done
in order to reduce the possibility of the examinee's getting answers by
seeing the final positions of the succeeding problems, since there were
two response-formations to a page. Thus, a few seconds elapse between
the time the examinee has seen the original formation and has located
the response-formation.

(2) Administration. — The examinees are told that this is a test of
their ability to remember positions of airplanes in formation. After 8
seconds' study for each formation, they are given the page and number of
the response-formation (which is also printed on the study page), and
allowed 15 seconds to locate, select, and blacken answers on the answer
sheet. The total testing time is approximately 22 minutes, including the
administration of the directions, which takes approximately 5 minutes.

(3) Scoring. — The scoring formula is R - W.

Statistical results. — There are no statistics available for this test.

Evaluation. — Since there are no statistical data available for this test,
it is difficult to make any real evaluation of it. The test was developed
as part of the plan to construct pure factor tests. It is not known
whether the visual-memory factor involves correct recall or recognition
of details, or positions, or of forms or any combination of these. Hence
a variety of such tests were developed for analytical study and for
validation.

Memory for Plane Silhouettes, CI503AX1 10

This test is a nonverbal, paired-associates, immediate-memory test with
a recognition response made to a pictorial stimulus. It was designed to
measure ability to remember the relationships between paired wholes.

Description. (1) Internal characteristics. — Silhouettes of planes are
presented on a page which the examinees study, the top and side-view
silhouettes of each plane being shown paired. After a brief study period,
the top silhouettes are presented in a column on a test page. In a parallel
column, side-view silhouettes of the planes are shown on this page, but
not paired with the top silhouettes as on the study page. A brief matching
test is thus presented. More side-view silhouettes are presented in
the recognition group than there are pairs on the study page. This is
done in order to lower the dependence of one item on another, and thus
prevent some right responses merely by a process of elimination. The
task of the examinee is to pair up the silhouettes in the same way they
were presented on the study page. There are 28 items involving 5 study
pages, each with 4 to 8 pairs. Since the planes are different in each
group, each section is independent of the others. There is a sample
item at the beginning of the test. A sample item, consisting of two sets
of planes, and three side-view silhouettes to which the responses are
made, is presented in figure 11.6.

10 Developed at Psychological Research Unit No. 3. Chief contributor: Lt. Frank J. Dudek.




(2) Administration. — Eighty seconds are given for each study period.
At the end of that period, a signal is given for the examinee to
turn to the response page. Two minutes are allowed to match the planes
and mark the answers. At the end of that period, another signal is given
to turn to the succeeding study page.

(3) Scoring. — The scoring formula is R - W/5.

Statistical  results.— Except  where  noted  below,  the  following  data  are 
for  unclassified  aviation  students  tested  in  October  1942  at  Psychological 
Research  Unit  No.  3. 

(1)  Distribution  statistics. — A  sample  of  238  unclassified  aviation 
students  yielded  a  mean  score  of  21.7  and  a  standard  deviation  of  5.0. 
The distribution curve is approximately symmetrical and somewhat flatter
than normal.

(2) Internal consistency. — The degree of homogeneity of the items
is indicated by a mean internal-consistency phi of 0.48, a standard deviation
of the phi distribution of 0.10, and a range of values from 0.24 to
0.66. These statistics are based upon analysis of the responses of the
highest 27 percent and the lowest 27 percent in total score of a group of
750 unclassified aviation students.
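The report does not spell out how each item's phi was computed. A minimal sketch, assuming the ordinary fourfold phi coefficient taken from pass and fail counts in the upper and lower 27 percent groups (the function name `item_phi` and the counts are hypothetical):

```python
from math import sqrt


def item_phi(upper_pass: int, upper_fail: int,
             lower_pass: int, lower_fail: int) -> float:
    """Fourfold phi for one item: group (upper/lower) by response (pass/fail)."""
    a, b, c, d = upper_pass, upper_fail, lower_pass, lower_fail
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / denom


# Hypothetical item: 203 examinees in each extreme group (about 27 percent
# of 750); 170 of the upper group and 80 of the lower group pass the item.
print(round(item_phi(170, 33, 80, 123), 2))
```

Averaging such values over all items would give a mean phi of the kind quoted above; an item that both groups pass equally often yields a phi of zero.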

(3) Reliability coefficient. — By the alternate-forms method (part I-part
II), an estimated reliability coefficient of 0.82, corrected for length,
was obtained. This figure is based on a sample of 238 unclassified aviation
students.
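The phrase "corrected for length" is not expanded in the text. One common reading, assumed here, is the Spearman-Brown formula, which steps a part-with-part correlation up to the reliability of the full-length test. The function names below are illustrative.

```python
def spearman_brown(r_part: float, n: float = 2.0) -> float:
    """Reliability of a test n times as long as the correlated parts."""
    return n * r_part / (1 + (n - 1) * r_part)


def part_correlation_for(r_full: float, n: float = 2.0) -> float:
    """Inverse of spearman_brown: the part correlation implied by r_full."""
    return r_full / (n - (n - 1) * r_full)


# Under this assumption, a corrected reliability of 0.82 for two parts
# implies a part I with part II correlation of roughly 0.69.
implied = part_correlation_for(0.82)
print(round(implied, 2), round(spearman_brown(implied), 2))
```

Setting `n=3` gives the correction for triple length that applies when a test has three parts.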

(4) Difficulty. — Based upon item analysis of the responses of 750
unclassified aviation students, the test yielded a mean proportion of correct
responses of 0.68, corrected for chance, with a range from 0.33 to
0.90 and a standard deviation of 0.14.
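How the difficulty values were "corrected for chance" is not stated. A sketch under the assumption that the usual guessing correction was applied to each item, with k answer choices per item (the function name, k, and the example proportion are hypothetical):

```python
def corrected_difficulty(p_observed: float, k: int) -> float:
    """Proportion correct after removing the share expected from guessing.

    Assumes examinees who do not know the answer choose at random among
    k alternatives, so chance success is 1/k.
    """
    if k < 2:
        raise ValueError("an item needs at least two alternatives")
    chance = 1.0 / k
    return (p_observed - chance) / (1.0 - chance)


# Hypothetical five-choice item answered correctly by 75 percent.
print(round(corrected_difficulty(0.75, 5), 4))
```

An item answered at exactly the chance rate comes out at zero, and an item everyone answers correctly stays at 1.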

(5)  Factorial  composition. — The  most  significant  loadings  are  in 
paired-associates  memory  (0.56),  spatial-relations  (0.38),  and  perceptual- 
speed  (0.34)  factors.  The  communality  is  0.61.  For  a  fuller  picture  of 
the  factorial  composition  of  this  test,  see  appendix  B. 

(6) Test validity. — Validation results based on several samples are
given in table 11.5.


Table 11.5. — Validity data for Memory for Plane Silhouettes, CI503AX1, based
upon samples of pilots in primary training, graduation-elimination criterion

N               Prop. grad.   M grad.   M elim.   SD     r bis
[illegible] 1   0.82          19.96     18.14     5.52   0.1[illegible]
169 2           .86           22.64     19.76     5.04   .31
[illegible] 3   .70           19.64     17.90     5.76   .18

1 Tested in January 1943 at Psychological Research Unit No. 3.
2 Tested in November 1942 at Psychological Research Unit No. 3.
3 Tested in October 1942 at Psychological Research Unit No. [illegible].
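The validity figures in table 11.5 are consistent with a biserial correlation against the graduation-elimination criterion. The sketch below rests on an assumption about the columns: that each row gives the proportion graduating, the mean score of graduates, the mean score of eliminees, the standard deviation of the whole sample, and the resulting coefficient. The function name is illustrative.

```python
from statistics import NormalDist


def biserial_r(p_grad: float, mean_grad: float,
               mean_elim: float, sd_total: float) -> float:
    """Biserial r between a test score and a pass/fail criterion."""
    q_elim = 1.0 - p_grad
    std_normal = NormalDist()
    # Ordinate of the unit normal curve at the point cutting off p_grad.
    y = std_normal.pdf(std_normal.inv_cdf(p_grad))
    return (mean_grad - mean_elim) / sd_total * (p_grad * q_elim) / y


# Second and third rows of table 11.5, read as described above.
print(round(biserial_r(0.86, 22.64, 19.76, 5.04), 2))
print(round(biserial_r(0.70, 19.64, 17.90, 5.76), 2))
```

Under this reading the two rows reproduce the tabled coefficients of .31 and .18, which supports the column interpretation without proving it.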


Evaluation. — Because this test has a fairly high correlation with the
pilot stanine, it would apparently add very little to the classification
battery in prediction of pilot success. This is due to the fact that the test
has some loadings in perceptual speed (average for two analyses is
0.34), and spatial relations (average is 0.38), both being valid factors
for pilot prediction and already heavily weighted in the pilot stanine. Its
loading with paired-associates memory is a unique contribution, but it
is not very heavily weighted in the pilot criterion.

The pilot validity of this test (weighted average of 0.21) is almost
fully accounted for by its common factors. This test would be suitable
to measure the paired-associates memory factor, except for the fact that
it has significant foreign variance in the perceptual and spatial-relations
factors. Memory for Landmarks is a more pure and more heavily
weighted test of the paired-associates memory ability and so is more
suitable to represent it.

The known factor content of this test accounts for only 61 percent of
its variance, compared to a reliability of 0.82.

Memory for Landmarks, CI510AX1 11

This is a visual-memory, paired-associates test in which a pictorial
symbol is paired with a verbal symbol. In the recognition test following
the study of these pairs, a long-matching-form arrangement is used.

Description. (1) Internal characteristics. — The test consists of 2
parts of 20 items each. Each part is in a separate booklet and is divided
into two sections. For each section there is a study page on which 15
diagrams of similar topographical features, i.e., lakes, rivers, bays, are
paired with their names. All the diagrams on any one page are of the
same type of geographical feature, but the feature varies from page to
page. After a brief study period, the examinee turns the page to the
response material. This consists of 10 diagrams, identical with 10 of the
15 on the study page. Alongside these diagrams is presented the original
list of 15 names, in mixed order. The task of the examinee is to match
the names with the diagrams. There is a short practice problem at the
beginning of the test. A sample of the diagrams on the study pages and
response pages is given in figure 11.7.

(2) Administration. — The examinees are informed that this is a test
to measure their ability to remember geographical landmarks. Four minutes
are allowed for the study of each set of landmarks; then, at a signal,
the examinees are told to turn the page and match the names with
the landmarks. Three minutes are allowed for selection and marking of
the answers. Strong cautionary statements are made prohibiting examinees
from turning back to the study page after the study period. The total
testing time for each booklet is approximately 20 minutes.

(3) Scoring. — The scoring formula is the number right only.

Statistical results. — The data given below are for examinees tested in
October 1942 at Psychological Research Unit No. 3.

(1) Distribution statistics. — Typical examples of distribution statistics
obtained on this test are given in table 11.6. The distribution curves
are approximately symmetrical and somewhat more peaked than normal.

Table 11.6. — Distribution constants for Memory for Landmarks, CI510AX, based
upon samples of unclassified aviation students

Part      N     M             SD
Part I    166   [illegible]   [illegible]
Part II   166   [illegible]   [illegible]
Total     161   16.0          [illegible]

11 Developed at Psychological Research Unit No. 3. Chief contributors: T/Sgt. Paul C. Davis, Lt. David H. Jenkins.

(2) Internal consistency. — Analysis of responses of several sample
groups yielded the internal-consistency data given in table 11.7.

Table 11.7. — Internal-consistency data for Memory for Landmarks, CI510AX,
based upon samples of unclassified aviation students


(3) Reliability coefficient. — By the alternate-forms method (part I-part
II), an estimated reliability coefficient of 0.82, corrected for length,
was obtained. This figure is based on a sample of 238 unclassified aviation
students.

(4)  Difficulty. — Based  upon  item  analysis  of  the  responses  of  179 
unclassified  aviation  students,  indices  of  difficulty  were  found  as  shown 
in  table  11.8. 

Table  11.8. — Difficulty  indices  for  Memory  for  Landmarks,  CI510AX,  based  upon 
179  unclassified  aviation  students 


(5) Factorial composition. — The most significant loadings are in the
paired-associates-memory (0.61) and visual-memory (0.20) factors, and
in a third memory factor (0.44), which seems to be confined to this
test and Plane Name Memory, CI506AX1. The communality is 0.68.
For a fuller description of the factorial composition of this test, see
appendix B.

(6)  Test  validity. — Validation  results  are  given  in  table  11.9. 

Table 11.9. — Validity data for Memory for Landmarks, CI510AX1, based on


Evaluation. — This test is not markedly different from others of the
paired-associates type, such as Plane Name Memory and Memory for
Ships, which have correlation coefficients with Memory for Landmarks of
0.69 and 0.51 respectively. Both of these other tests yield slightly higher
validity coefficients with pilot training.

Factor analysis of this test shows that 68 percent of the total variance
is accounted for by common factors. Of this, 4 percent is attributed to
the visual-memory factor, 37 percent to the paired-associates-memory
factor, and 19 percent to a third memory factor that seems to be confined
to this test and Plane Name Memory. The remaining 8 percent of the total
variance is accounted for by factors on which the loadings are quite low.

Since an estimate of the pilot validity, made from factorial equations
(see table 28.18), is similar to that found empirically, the inference is
that all factors valid for pilot selection have been accounted for in the
analysis. Considerable nonerror variance, however, is still to be defined
in this test.


This test is the purest measure developed of the paired-associates-memory
factor, which accounts for 37 percent of its common variance.
Both this factor and memory III have some pilot validity. Though it
does not equal some of the other paired-associates memory tests for predicting
pilot success, it does have something unique to offer and does
have value in factor-analysis research.

FIGURE 11.8
SAMPLE STUDY PAGE AND RESPONSE ITEMS OF PLANE
NAME MEMORY, CI506AX2

Plane Name Memory, CI506AX2 12

This is a visual-memory, paired-associates test in which plane silhouettes
are paired with their names so that later presentation of the
pictorial stimulus is to call forth the verbal associate.

Description. (1) Internal characteristics. — The test includes 40 items
which form two parts. In part I, 20 planes are shown in front-view silhouettes;
in part II, 20 planes are shown in side-view silhouettes. The
name of each plane appears below the silhouette. After a study period,
the examinees turn to a page on which the same planes are arranged in
different order with five names under each plane. The examinees are
told to select the correct name of each plane. Samples of the stimulus
and response items are given in the upper and lower portions respectively
of figure 11.8.

(2) Administration. — The examinees are informed that the test is a
measure of their ability to learn the names of planes. Four minutes are
allowed for each study period. Six minutes are given in each part for
the selection and marking of answers. The approximate total time of
testing is 25 minutes.

(3) Scoring. — The scoring formula is R - W/4.

Statistical results. (1) Distribution statistics. — Typical examples of
distribution statistics obtained on this test are given in table 11.10. The
distribution curves are approximately normal.


Table 11.10. — Distribution constants for Plane Name Memory, CI506AX

Group                              N     M             SD
Unclassified aviation students 1   231   20.1          [illegible]
Classified pilots 2                505   23.1          [illegible]
Classified pilots 3                743   [illegible]   6.7

1 Tested at Psychological Research Unit No. 3 in October 1942.
2 In class 43K. Tested at Psychological Research Unit No. 3 in January and February 1943.
3 In class 44D. Tested at Psychological Research Unit No. 3 in September 1943.


(2) Internal consistency. — The degree of homogeneity of the test is
indicated by a mean internal-consistency phi of 0.41, a standard deviation
of the phi distribution of 0.10, and a range of values from 0.20 to
0.65. These statistics are based upon the highest 27 percent and the lowest
27 percent in total score of a group of 750 unclassified aviation students
tested at Psychological Research Unit No. 3.

(3) Reliability coefficient. — By the alternate-forms method (part I-part
II), an estimated reliability coefficient of 0.82, corrected for length,
was obtained. This figure is based on a sample of 238 unclassified aviation
students tested at Psychological Research Unit No. 3 in October
1942.

12 Developed at Psychological Research Unit No. 3. Chief contributor: Lt. Mahlon I. Smith.

(4) Difficulty. — Based upon item analysis of the responses of 750 unclassified
aviation students tested at Psychological Research Unit No. 3,
the test yielded a mean proportion of correct responses of 0.57, corrected
for chance, with a range from 0.30 to 0.86, and a standard deviation
of 0.16.

(5) Factorial composition. — The most significant loadings are in the
perceptual-speed (0.29), the paired-associates-memory (0.58), and the
third memory (0.51) factors. The communality is 0.71. For a fuller picture
of the factorial composition of this test, see appendix B.

(6)  Test  validity. — Validation  results  based  on  several  samples  are 
given  in  table  11.11. 


Table 11.11. — Validity data for Plane Name Memory, CI506AX2, using the
graduation-elimination criterion

Group                          Class         Score      N             Prop. grad.   M grad.       M elim.       SD            r bis   r bis corrected 1
Pilots in primary training 2   [illegible]   R-W/4      222           0.74          22.12         [illegible]   [illegible]   0.24
Pilots in primary training 3   [illegible]   R-W/4      170           [illegible]   21.06         19.30         [illegible]   .11
Pilots in primary training 4   [illegible]   R-W/4      [illegible]   [illegible]   24.62         17.70         [illegible]   .47
Pilots in primary training 5   43K           R-W/4      505           [illegible]   23.02         20.31         [illegible]   .24
Pilots in primary training 6   44D           R-W/4      743           .97           29.16         [illegible]   6.74          .15     0.10
Navigation students 7          ....          Rights 8   [illegible]   .92           [illegible]   29.97         6.24          .10     .20
                               ....          Wrongs 8   [illegible]   .92           7.35          [illegible]   3.50          -.11    -.20
                               ....          R-W/4      1,652         .92           29.41         [illegible]   7.42          .11     .21

1 Assuming an unrestricted stanine standard deviation of 2.00.
2 Tested October 2 and 3, 1942, at Psychological Research Unit No. [illegible].
3 Tested November [illegible] and 17, 1942, at Psychological Research Unit No. [illegible].
4 Testing dates and locale not available.
5 Tested January 20, February 1 and 9, 1943, at Psychological Research Unit No. [illegible].
6 Tested September 2 and 3, 1943, at Psychological Research Unit No. 3.
7 Tested May 31 and June 1, 1943, at Psychological Research Unit No. [illegible]; April 17 through 21,
1943, at Psychological Research Unit No. 2; and April 10, 11, and 12, 1943, at Psychological
Research Unit No. [illegible].
8 For this sample, the correlation between rights and wrongs is -.1[illegible].


Evaluation. — Plane Name Memory, another of the paired-associates
type of test, shows relatively moderate pilot and navigator validity (approximately
0.22 and 0.21, respectively). Factor analysis of this test
shows that 71 percent of the total variance is accounted for by common
factors, compared with a reliability of 0.82. Of this, the perceptual-speed
factor accounts for 8 percent, the paired-associates-memory factor for
34 percent, and a third memory factor for 26 percent. This latter factor
seems to be restricted to this test and Memory for Landmarks and will be
discussed later in the chapter. The remaining 3 percent of the known variance
is accounted for by factors on which the loadings are quite low.
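The percentages in this paragraph can be checked against the loadings reported under factorial composition. The sketch below assumes uncorrelated (orthogonal) factors, so that each factor's share of the total variance is the square of its loading; the variable names are illustrative, and the loadings, communality, and reliability are the values quoted for this test.

```python
# Loadings, communality, and reliability quoted for Plane Name Memory.
loadings = {
    "perceptual speed": 0.29,
    "paired-associates memory": 0.58,
    "third memory factor": 0.51,
}
communality = 0.71
reliability = 0.82

# With orthogonal factors, variance share = loading squared.
shares = {name: a ** 2 for name, a in loadings.items()}
other_common = communality - sum(shares.values())   # low-loading factors
undefined_nonerror = reliability - communality      # reliable but unexplained

for name, share in shares.items():
    print(f"{name}: {share:.0%}")
print(f"other common factors: {other_common:.0%}")
print(f"undefined nonerror variance: {undefined_nonerror:.0%}")
```

The squared loadings round to 8, 34, and 26 percent, leaving about 3 percent for the minor factors, in agreement with the text. The gap between reliability and communality is the nonerror variance that the analysis leaves undefined.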

An estimation of the pilot validity coefficient of this test, made from
factors (see table 28.17) for which the pilot validity is known, accounts
for approximately 0.16 of the validity coefficient of 0.22. The difference
may be due to variance in visualization, which was lacking in the particular
analysis in which this test appeared.

Memory for Ships, CI504AX 13

This is another visual-memory test in which a pictorial symbol and a
verbal symbol are paired so that the pictorial stimulus is to stimulate
recall of the verbal associate.

Description. (1) Internal characteristics. — The test consists of three
study pages and three response pages. On each study page are 10 ships
paired with their respective nationalities. Succeeding the study page is a
response page on which 12 ships are presented, without their nationalities
indicated, 10 of which are the same as on the previous page, and 2
that are not. The ships are shown from an oblique aerial view and all are
headed in the same direction.

The task of the examinee is to determine the nationality of each ship
as shown on the study page, or, if it did not appear on the study page,
to indicate that it was not shown. At the top of each response page is
placed a letter symbol for each nationality which is used in marking the
answers. Figure 11.9 shows a portion of both the stimulus and the response
pages.

(2) Administration. — The examinees are informed that the test is to
see how well they can remember ships and their nationalities. Two minutes
are allowed to study the ships; then, at a signal, the page is turned
and 3 minutes are allowed for answering the problems. The total testing
time is approximately 30 minutes.

(3) Scoring. — The scoring formula is R - W/5.

Statistical results. — All the data following are for examinees at
Psychological Research Unit No. 3.

(1) Distribution statistics. — A sample of 238 unclassified aviation
students tested in November 1942 yielded a mean score of 15.9 and a
standard deviation of 6.4. The distribution curve is approximately symmetrical
and flatter than normal.

(2) Reliability coefficient. — Correlations among the three parts of this
test yielded the estimates of reliability given in table 11.12.

Table 11.12. — Estimated alternate-forms reliability coefficients for Memory for
Ships, CI504AX, based upon a sample of 238 unclassified aviation students 1

Parts                   r             r corrected 2
Part I with Part II     [illegible]   [illegible]
Part I with Part III    [illegible]   [illegible]
Part II with Part III   .41           [illegible]

1 Tested in November 1942.
2 Corrected for triple length.

(3) Factorial composition. — The most significant loadings are in the
paired-associates-memory (0.50), spatial-relations (0.31), and the
perceptual-speed (0.29) factors. The communality is 0.51. For a fuller picture
of the factorial composition of this test, see appendix B.

13 Developed at Psychological Research Unit No. 3. Chief contributors: Lt. [name illegible], Lt. David H. Jenkins.

(4)  Test  validity. — Validation  results  based  on  two  samples  are  given 
in  table  11.13. 


Table 11.13. — Validity data for Memory for Ships, CI504AX, based upon the
graduation-elimination criterion in primary pilot training

N       Prop. grad.   M grad.   M elim.   SD     r bis
658 1   0.73          16.14     14.08     6.30   0.20
586 2   .79           17.71     15.88     6.27   .17

1 In class 43I. Tested November 9, 10, 11, and 17, 1942.
2 Eighty-nine cases in class 43I, 154 cases in class 43J, and 343 cases in class 43K. Tested
November 14, 1942, January 20, 1943, and February 8, 9, and 10, 1943.


Evaluation. — Memory for Ships has a moderately low validity for the
prediction of pilot success (average validity coefficient of 0.17), but a
sufficiently low correlation with the pilot stanine so that it would add,
though by an amount rarely worth considering, to the validity coefficient
if used in conjunction with the classification battery.

Factor analysis of this test shows that 51 percent of the total variance
is accounted for by common factors, leaving a fair amount of undefined
nonerror variance. Of the known variance, the perceptual-speed
factor accounts for 8 percent, the spatial-relations factor for 10 percent,
and the paired-associates-memory factor for 25 percent. The remaining
8 percent of the total variance is accounted for by factors on which the
loadings are quite low. That the spatial-relations factor accounts for 10
percent of the common variance of this test seems unusual, but the fact
that, in memory of naval ships, spatially complicated structures are involved
seems reasonable rationalization for it.

An estimate of the pilot validity for this test (see table 28.18) is 0.20,
which is nearly the same as the empirical value (0.17). This indicates
that all factors valid for the pilot have been accounted for. This test,
like Memory for Plane Silhouettes, has relatively high loadings on the
perceptual and spatial-relations factors and thus is not as pure a measure
of the paired-associates factor as is Memory for Landmarks.

SYMBOLIC  MEMORY  TESTS 

The second main category under which memory tests are grouped is
the memory for verbal, or more accurately, symbolic material. Here also
the tests falling in this classification may be divided into those that involve
the ability to remember and to recognize complex wholes and relations
of parts, and those that require the remembering of simple
wholes (paired associates). The former includes tests in which the
stimulus is auditory and those in which the stimulus is printed. The latter
group as represented here consists only of tests with a verbal
(printed) stimulus and response.

FIGURE 11.9
PORTIONS OF A STUDY PAGE AND TEST ITEMS OF
MEMORY FOR SHIPS, CI504AX1


Rationale 

Memory for symbolic material is more generally involved in learning
and in training, and is thus perhaps more important than memory for
pictorial material. The general rationale behind this area hardly needs to
be elaborated and is covered in the earlier part of this chapter. The
necessity for this type of memory does not cease upon completion of
training, but continues throughout combat. Briefing for a combat mission
involves not only pictorial material, but also verbal material, which supplements
the pictorial material. Much of this material is presented orally
as well as in printed form. The auditory presentation of complex material
is represented by the Memory for Tactical Plans test, while the written
presentation of complex material is represented by the Geographical
Memory Test. Memory for simple symbols (paired associates) is represented
by the Memory for Plane Designation test, which measures a
basic type of learning and memory.

Memory for Tactical Plans, CI509BX 14

This is a verbal, auditory-memory test designed to measure ability to
remember meaningful material (instructions) over a longer term than
used in immediate-memory tests. The stimulus is presented auditorily,
and the response is made to printed questions. There are three closely
similar forms of this test. The original form, CI509AX, was subjected
to item analysis and revised to provide form CI509BX. With but extremely
slight changes, this form was phonographically recorded and
called form CI509C.

Description. (1) Internal characteristics. — The examinees are read a
summary of briefing data for a mock bombing mission. About 2 hours
later, after other tests have been interposed, 40 simple memory questions,
divided into two parts of 20 items each, concerning the briefing
data are asked. A sample paragraph of the briefing data and corresponding
items are as follows:

Major Carpenter's flight will follow 4 minutes behind Major Wilson's flight at
an altitude of 21,000 feet. They will carry 500-pound bombs and incendiaries. Major
Carpenter's flight will have the additional assignment of photographing the bombed
area.

Sample items are as follows:

Major Carpenter's flight will carry:

A. 100-pound bombs and incendiary bombs.
B. 500-pound bombs and incendiary bombs.
C. 1000-pound bombs and incendiary bombs.
D. 2000-pound bombs and incendiary bombs.
E. Block busters.

Major Carpenter's flight has the assignment of:

A. Attacking the troop loading zone.
B. Attacking the roundhouse.
C. Attacking the warehouse.
D. Photographing the bombed area.
E. Bombing the alternative objective.

14 Developed at Psychological Research Unit No. 3. Chief contributors: Capt. Milton [surname illegible], Capt. Harry [surname illegible].

(2)  Administration. — The  pertinent  parts  of  the  directions  preceding 
the  briefing  are  as  follows: 

.  .  .  .  later  in  the  day  you  will  be  asked  to  answer  questions  based  upon  what 
you  hear  now. 

Assume  that  you  arc  a  member  of  a  flight  which  is  to  take  part  in  a  bombing 
raid.  You  are  listening  to  the  instructions  of  your  flight  commander. 

The interim between the briefing and the written questions varied
from 2 to 3 hours among the different forms. The directions and briefing
for CI509C were phonographically recorded in an effort to achieve
greater standardization. Total testing time is approximately 25 minutes,
with 10 minutes allowed for the directions and briefing.

(3) Scoring. — The scoring formula is R − W/4 in all forms, where R is
the number of right answers and W the number of wrong answers.
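
The formula can be read as the usual correction for guessing on five-choice items: each wrong answer costs one quarter of a point, so that blind guessing earns an expected score of zero. A minimal Python sketch follows. The function name and the `n_choices` parameter are illustrative assumptions and are not part of the original scoring procedure.

```python
def formula_score(rights: int, wrongs: int, n_choices: int = 5) -> float:
    """Rights-minus-a-fraction-of-wrongs score, R - W/(k - 1).

    With five answer choices (k = 5) this is the R - W/4 of the text.
    Omitted items count neither as right nor as wrong.
    """
    if n_choices < 2:
        raise ValueError("an item needs at least two answer choices")
    return rights - wrongs / (n_choices - 1)


# Hypothetical examinee: 30 right, 8 wrong, 2 omitted on a 40-item form.
score = formula_score(30, 8)

# An examinee who guesses blindly on all 40 five-choice items expects
# about 8 right and 32 wrong, which the formula scores as zero.
guessing_score = formula_score(8, 32)
```

The penalty of 1/(k − 1) per wrong answer is what makes the expected gain from random guessing vanish; omitted items are simply left out of both counts.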
Statistical results. — All the data given below are for examinees at
Psychological Research Unit No. 3.

(1) Distribution statistics. — Typical examples of distribution statistics
obtained on this test are given in table 11.14. The distribution
curves are approximately normal.


Table 11.14. — Distribution constants for Memory for Tactical Plans, CI509

[Table body not recoverable from the source.]

1 In class 44D. Testing dates not reported.

2 In class 43J. Tested January 9, 11, 13, 15, and [illegible], 1943.


(2) Internal consistency. — The degree of homogeneity of the items
(in form CI509BX) is indicated by a mean internal-consistency phi of
0.36, a standard deviation of the phi distribution of 0.12, and a range of
values from 0.02 to 0.59. These statistics are based upon analysis of the
responses of the highest 27 percent and the lowest 27 percent in total
score of a group of 700 unclassified aviation students.
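
As an illustration of this kind of item statistic, the sketch below computes an ordinary fourfold phi coefficient for a single item from the numbers passing it in the upper and lower 27 percent groups. This is an assumption about the computation: the report does not say whether phi was calculated directly or read from a chart, and the counts in the example are invented.

```python
from math import sqrt


def item_phi(upper_pass: int, upper_n: int, lower_pass: int, lower_n: int) -> float:
    """Fourfold phi between group membership (upper/lower) and passing an item."""
    a = upper_pass             # upper group, item right
    b = upper_n - upper_pass   # upper group, item wrong
    c = lower_pass             # lower group, item right
    d = lower_n - lower_pass   # lower group, item wrong
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        return 0.0             # no variation in one margin; phi is undefined
    return (a * d - b * c) / denom


# 27 percent of 700 examinees is 189 in each extreme group.
# Hypothetical item: 150 of the upper group and 80 of the lower group pass.
phi = item_phi(150, 189, 80, 189)
```

An item that separates the two extreme groups perfectly gives phi = 1, and an item passed equally often in both groups gives phi = 0.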

(3) Reliability coefficient. — By the alternate-forms method, an estimated
reliability coefficient of 0.68, corrected for length, was obtained
for the AX form. This figure is based on a sample of 500 unclassified
aviation students tested in October 1942.
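
The report does not name the correction for length that was applied. The Spearman-Brown formula is the customary one and is assumed in the sketch below; the uncorrected correlation of 0.515 is a hypothetical value chosen only because it steps up to about 0.68 when the test length is doubled.

```python
def spearman_brown(r: float, length_factor: float = 2.0) -> float:
    """Reliability expected when a test is lengthened by `length_factor`.

    r is the correlation between two comparable forms (or halves);
    length_factor = 2 estimates the reliability of the two combined.
    """
    return length_factor * r / (1.0 + (length_factor - 1.0) * r)


# Hypothetical correlation between two alternate forms, stepped up to the
# reliability of a test twice as long.
corrected = spearman_brown(0.515, 2)
```

With a length factor of 1 the function returns the correlation unchanged, which is a quick check that the formula has been entered correctly.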

(4) Difficulty. — Based upon item analysis of the responses of 700
unclassified aviation students, the test (form CI509BX) yielded a mean
proportion of correct responses of 0.54, corrected for chance, with a
range from 0.08 to 0.90 and a standard deviation of 0.19.

(5) Factorial composition. — The most significant loadings of the AX
form are in the verbal (0.57) and visualization (0.32) factors. The
communality is only 0.47. For a fuller picture of the factorial composition
of this test, see appendix B.

(6)  Test  validity. — Validation  results  based  on  several  samples  are 
given  in  table  11.15. 


Table 11.15. — Validity data for Memory for Tactical Plans, CI509BX
(graduation-elimination used as criterion in all samples)

Group                              N     p_g    M_g     M_e     SD_t   r_bis   r_bis corrected 1
Pilots in primary training 2       570   0.84   22.15   19.96   7.05   0.17    0.20
Pilots in basic training 2         477    .91   22.18   21.67   6.??    .04     .16
Pilots through basic training 2    570    .76   22.1?   2?.5?   7.05    .14     .17
Pilots in primary training 3       790    ?     ?5.9?   2?.29   6.45    .20     .19

A question mark stands for a digit or entry that is illegible in the source. The
heading p_g (proportion graduating) is supplied; the original column heading is lost.

1 Assuming an unrestricted stanine standard deviation of 1.99.

2 Same sample followed through primary and basic training in class 43J. Tested January 1943.

3 In class 44D. Testing dates not reported.


Evaluation. — Since the correlation of this test with the pilot stanine
is low, the test would be of some value to the classification battery in
spite of its rather low validity. Other features, however, such as the
difficulty of administration, made its inclusion in the battery impractical.

Since this test was designed to measure a relatively long-term type of
memory, it is unique among the memory tests. It does not appear to be
loaded with any of the memory factors in common with short-term memory
tests. Factor analysis shows that only 47 percent of the total variance
of the test is accounted for by common factors, compared with a
reliability of 0.68. Of this, the verbal factor accounts for 32 percent and the
visualization factor for 10 percent. This is to be expected, since the
material is presented orally and tested with questions involving the
relative positions of three flights attacking a target. The remaining 5
percent of the total variance accounted for by the analysis is on factors on
which the loadings are quite low.
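
The percentages in this paragraph follow from squaring the factor loadings. The sketch below reproduces the arithmetic from the figures quoted in the text (loadings of 0.57 and 0.32, communality 0.47, reliability 0.68). The function and key names are illustrative, and the split of the remaining variance into "specific" and "error" parts assumes the usual identity that reliability equals communality plus specific variance.

```python
def variance_breakdown(loadings: dict, communality: float, reliability: float) -> dict:
    """Split a test's total variance (taken as 1.0) into its components."""
    # Each factor's share of the variance is the square of its loading.
    shares = {name: a * a for name, a in loadings.items()}
    named_common = sum(shares.values())
    return {
        "factor_shares": shares,
        # Common variance carried by factors with low loadings.
        "other_common": communality - named_common,
        # Reliable variance not shared with any common factor.
        "specific": reliability - communality,
        # Unreliable (error) variance.
        "error": 1.0 - reliability,
    }


tactical_plans = variance_breakdown(
    {"verbal": 0.57, "visualization": 0.32},
    communality=0.47,
    reliability=0.68,
)
```

On these figures about 21 percent of the variance is reliable but not accounted for by any common factor, which is the portion in which an unidentified factor could lie.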

Examination of the duties of air crew indicates that this type of
memory should be important. Since this is the only test measuring more
than immediate memory, and since it has validity, further development
would be worthwhile. A study of its unknown valid factor, or factors,
would be profitable, for, when this variance is properly identified, a
more unique test without verbal variance might be constructed. Its
validity for pilot selection is very largely unaccounted for by known
factors. Its average obtained validity is 0.19, whereas that expected from
known factors is only 0.06. Between these two values lies rich unexplored
territory.

Geographical Memory, CI508AX *

This is a symbolic memory test involving the relationships of parts to
a complex whole. Both the stimulus and the response part of the problems
are presented in printed form.

* Developed at Psychological Research Unit No. [illegible]. Chief contributors: [names illegible in the source].

Description. (1) Internal characteristics. — In this test, written
descriptions of geographical areas, each approximately a typewritten page
long, are presented for the examinees to study. At the end of each study
period, questions are asked about the location of important points, their
direction and distance from each other, and details about transportation
routes. Answers to some of the questions are not specifically stated in the
description, but can be determined from the information given. The test
consists of descriptions of two geographical areas with a cluster of 20
items concerning the first, and 25 items concerning the second description.
A paragraph from the geographical description and corresponding
items are presented below.

The Olson-Van Ruyan Marine Engine Corporation factory is near the northwest
corner of the bay. It is served by a single-track railroad, coming from the north. A
two-lane highway runs along the north shore of the bay from the Olson-Van Ruyan
factory to Warrenton at the northeast corner of the bay. Warrenton also extends
along the east bay shore for about 4 miles. Commercial docks extend for about 3
miles midway on the east bay shore. An oil pipe line, bringing oil from wells farther
east, terminates at the docks. Near the east end of the south shore of the bay is the
Great Western Shipyard, and near the western end is the Admiralty Seaplane Base.

The distance from the Olson-Van Ruyan factory to the seaplane base is about:

A. 2 miles.

B. 5 miles.

C. 8 miles.

D. 11 miles.

E. 15 miles.

Goods arriving at the Olson-Van Ruyan Marine Engine factory from the north
would likely come by:

A. Truck over the 4-lane highway.

B. Single track railroad.

C. Double track railroad.

D. Truck over the 2-lane highway.

E. Truck over the 3-lane highway.

(2) Administration. — The directions for the test are as follows:

This is a test of your ability to remember details of a geographical description.
You will have 7 minutes to study a written description of a geographic area. At the
end of that time you will be asked questions based upon the description. You
may find it helpful to imagine a map of the area described.

You should note especially the following characteristics of the area.

1. Location of important details.

2. Direction of important points from each other.

3. Distance of important points from each other.

4. Location and details of transportation routes . . .

As indicated above, 7 minutes are allowed for study of the description
on each part. Ten minutes are given for selection and marking of
answers on the first part and 12 minutes for the second part. The total
testing time is 40 minutes.

(3) Scoring. — The scoring formula is R − W/4.

Statistical  results. — The  data  are  for  examinees  tested  at  Psychological 
Research  Unit  No.  3  on  October  19  and  20,  1942. 

(1) Distribution statistics. — Typical examples of distribution statistics
obtained on this test are given in table 11.16. The distribution curves
are approximately symmetrical and somewhat flatter than normal.


Table 11.16. — Distribution constants for Geographical Memory, CI508AX

Part      N      M      SD
Total     ?     12.9    9.2
I        221     5.7    4.5
II       221     ?      6.1

A question mark stands for an entry that is illegible in the source.

(2) Reliability coefficient. — By the alternate-forms method, an estimated
reliability coefficient of 0.74, corrected for length, was obtained.
This figure is based on a sample of 250 unclassified aviation students.

(3) Test validity. — Validation results based on several samples are
given in table 11.17.


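Table 11.17. — Validity data for Geographical Memory, CI508AX, based upon a
sample of 223 pilots (graduation-elimination in primary training used as criterion;
p_g = [illegible])

Part    M_g     M_e     SD_t    r_bis
I       ?       5.37    4.14    0.06
II      ?.67    7.17    6.12     .31

A question mark stands for a digit or entry that is illegible in the source. The
headings M_g and M_e are supplied; the original column headings are lost.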

Evaluation. — This test had been intended to resemble Map Memory,
CI505AX1, in all respects except that the geographical material was
presented in verbal rather than visual form. Presenting the material
verbally caused a high correlation with the verbal test, Reading
Comprehension (0.43), which is in the classification battery. This would mean a
very high loading of the verbal factor in this test, possibly as high as
0.70. This fact probably indicates a lack of factorial resemblance to Map
Memory, and so it was felt that further development of this test was not
worthwhile.

Memory for Plane Designations, CI507AX

This is a paired-associates test utilizing symbolic material in literal
and verbal form for the stimulus and the response respectively.

Description.  (1)  Internal  characteristics. — The  test  consists  of  2  parts 
of  20  items  each.  In  each  part  there  is  a  study  page  on  which  the  names 
of  20  hypothetical  airplane  manufacturers  are  given,  each  paired  with 
a  three-letter  symbol,  somewhat  like  those  given  by  the  Navy  for  plane 
designation.  For  example: 


P-YD  . O’Rourke. 

P-LC  . . Inman. 

P-ZI . Brennerman. 


The first letter is the same throughout the part, but no two combinations
of the last two letters are similar for different associates. After a study
period, the examinees turn to the response page on which the 20 symbols
are given with five names listed below each one. A sample item follows:

P-LC

a. Dalton
b. O’Rourke
c. Brennerman
d. Inman
e. Powers

The  task  of  the  examinees  is  to  select  from  the  five  choices  that  name 
that  has  been  paired  with  the  symbol  on  the  study  page.  A  later  form, 
CI507BX,  is  similar,  except  the  symbols  are  paired  with  plane  names 
instead  of  the  names  of  manufacturers. 

(2)  Administration. — Six  minutes  are  allowed  for  the  study  period 
and  the  same  for  selecting  and  marking  the  answers.  The  testing  time 
is  approximately  25  minutes. 

(3) Scoring. — The scoring formula is R − W/4.

Statistical results. — The data given below are for examinees tested at
Psychological Research Unit No. 3 on October 13, 15, and 17, 1942.

(1) Distribution statistics. — Typical examples of distribution statistics
obtained on CI507BX, the later form of this test mentioned above,
are given in table 11.18. The distribution curves are approximately
symmetrical and somewhat flatter than normal.


Table 11.18. — Distribution constants for Memory for Plane Designations,
CI507BX, based upon a sample of [illegible] unclassified aviation students

[Columns: Part, M, SD. The table body is illegible in the source.]

[Footnote on the development of the test and its chief contributor; illegible in the source.]


(2) Reliability coefficient. — By the alternate-forms method, an estimated
reliability coefficient of 0.82, corrected for length, was obtained.
This figure is based on a sample of 367 unclassified aviation students.

(3) Test validity. — A sample of 348 pilots yielded a biserial correlation
of −0.03 between performance in this test and the graduation-elimination
criterion in primary pilot training. The mean score for graduates
was 8.33, for eliminees 8.56, and the standard deviation for both
combined was 5.18. Of this sample, 67 percent were graduates.
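
The reported coefficient can be checked from the figures just given. The sketch below uses the standard biserial formula, r = (M_g − M_e)/SD × pq/y, where p is the proportion graduating, q = 1 − p, and y is the ordinate of the unit normal curve at the point that cuts off the proportion p. The function name is illustrative, and the report does not state which computing routine was actually used.

```python
from statistics import NormalDist


def biserial_r(mean_grad: float, mean_elim: float, sd_total: float, p_grad: float) -> float:
    """Biserial correlation between a test score and a pass/fail criterion."""
    q_elim = 1.0 - p_grad
    unit_normal = NormalDist()            # mean 0, standard deviation 1
    z = unit_normal.inv_cdf(p_grad)       # point dividing the two groups
    y = unit_normal.pdf(z)                # normal ordinate at that point
    return (mean_grad - mean_elim) / sd_total * (p_grad * q_elim) / y


# Figures quoted in the text for Memory for Plane Designations.
r_bis = biserial_r(mean_grad=8.33, mean_elim=8.56, sd_total=5.18, p_grad=0.67)
```

The coefficient comes out slightly negative because the eliminees averaged a little higher than the graduates; rounded to two places it agrees with the −0.03 reported.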

Evaluation. — Memory for Plane Designations does not correlate to any
degree with any of the classification tests, the highest correlations being
with Reading Comprehension (0.24) and with Numerical Operations
(0.24). This test would be expected to have some degree of relationship
with the former test because of the emphasis on verbal symbolism in
both of them. The low pilot validity coefficient of this test is consistent
with its verbal and numerical content. Although it has not been factor
analyzed, its zero validity leads us to expect no significant variance in
any valid factor. Because of these considerations, no further development
of the test was undertaken.


FACTOR ANALYSIS OF MEMORY TESTS

In order to obtain a clearer picture of the memory area and of the
memory tests developed in the aviation-psychology program, two factor
analyses were made of all the memory tests that were ready in the summer
of 1942.

The Data

Two batteries were administered in order to obtain the intercorrelations
for use in the analysis. Twelve tests appear in battery I, five of
which were memory tests and the remaining seven were tests selected from the
classification battery. The second battery included 13 tests, 6 of which
were memory tests and the remaining 7 were the same classification tests
used in battery I. Thurstone's centroid method with rotation of axes was
employed. The sample for battery I was composed of 179 classified
bombardiers, and for battery II of 238 unclassified aviation students.
In both cases the range of talent was practically unrestricted by selection
other than that produced by the AAF Qualifying Examination. In
both correlation matrices, the intercorrelations of classification tests were
based on 527 unclassified students, assuming that they were comparable
to the two special samples and with the belief that much more stable
data were thus obtained. The intercorrelations for the two batteries are
presented in tables 11.19 and 11.20, the centroid loadings in tables 11.21
and 11.22, and the rotated factor loadings in tables 11.23 and 11.24.
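
For readers unfamiliar with the centroid method, the sketch below extracts the first centroid factor from a small correlation matrix. It is a simplified illustration and not the computing routine used in the program: it assumes the common convention of estimating each communality by the largest absolute correlation in the column, it omits the sign reflections needed before later factors are extracted, and the three-test matrix is invented (it was built from loadings of 0.8, 0.7, and 0.6 on a single factor).

```python
from math import sqrt


def first_centroid(r):
    """Return (loadings, residuals) for the first centroid factor.

    r is a symmetric correlation matrix given as a list of lists.
    Its diagonal is ignored and replaced by communality estimates.
    """
    n = len(r)
    m = [row[:] for row in r]                       # copy; do not alter r
    for j in range(n):
        # Communality estimate: largest absolute correlation in the column.
        m[j][j] = max(abs(r[i][j]) for i in range(n) if i != j)
    col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    total = sum(col_sums)                           # must be positive
    loadings = [s / sqrt(total) for s in col_sums]  # centroid loadings
    # Residual correlations after removing the first factor.
    residuals = [[m[i][j] - loadings[i] * loadings[j] for j in range(n)]
                 for i in range(n)]
    return loadings, residuals


R = [
    [1.00, 0.56, 0.48],
    [0.56, 1.00, 0.42],
    [0.48, 0.42, 1.00],
]
loadings, residuals = first_centroid(R)
```

Each column of the residual matrix sums to zero, which is why some variables must be reflected in sign before a second centroid can be taken out. The loadings also depend on the communality estimates, so they only approximate the values used to build the example matrix.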

[Footnote crediting the authors of this section; names illegible in the source.]


Table 11.23. — Rotated factor loadings for Memory Battery I 1

Test 2                                 I    II   III    IV     V    VI   VII    h²
1. Speed of Identification            64    14    02    21   -01   -01   -01    47
2. Spatial Orientation I              61    04   -04    28   -07    13    18    51
3. Spatial Orientation II             61    16    07     ?   -01    12    38    57
4. SAM Complex Coordination           22    11    06    46    16    06     ?    32
5. Mechanical Comprehension           14    27   -05    36    27     ?    29    38
6. Reading Comprehension              67 21 12 39 34 73 (incomplete)
7. Arithmetic Reasoning                ?    35    20     ?    47    22    07    44
8. Map Memory, CI505AX1               35 42 09 ? 54 26 70 (incomplete)
11. Map Memory, CI505AX3              1?    31    15    21    14    55    31    61
12. Memory for Plane Silhouettes      41 07 50 43 -07 ? 63 (incomplete)
13. Memory for Landmarks              14    25    56    12   -04    21   -13    47
14. Memory for Tactical Plans        -02    57    10    08    09   -12    32    47

A question mark stands for an entry that is illegible in the source. Rows marked
(incomplete) have lost one or more entries; their surviving values are listed in
source order and are not aligned with the column headings.

1 Decimal points have been omitted.
2 For code numbers see table 11.21.


Table 11.24. — Rotated factor loadings for Memory Battery II 1

Test 2                                 I    II   III    IV     V    VI   VII  VIII    h²
1. Speed of Identification            66   -04    08    17   -04     ?    09   -09   539
2. Spatial Orientation I              62 -04 24 ? 18 20 11 533 (incomplete)
3. Spatial Orientation II             63 25 11 09 11 ? 502 (incomplete)
4. SAM Complex Coordination           22 06 03 52 ? 07 12 369 (incomplete)
5. Mechanical Comprehension           23    36    07    33     ?    10    15   -14   361
6. Reading Comprehension             -01    52    18    26    40    12   -03    00   546
7. Arithmetic Reasoning               -05 27 -02 15 68 14 590 (incomplete)
9. Map Memory, CI505BX1               22    05    41     ?    23    52    05    07   556
10. Map Memory, CI505AX2              35    23    14    16    17    58     ?   -02   591
12. Memory for Plane Silhouettes      25 26 62 32 -11 20 ? 679 (incomplete)
13. Memory for Landmarks              1? ? 06 ? 33 19 44 ? 824 (garbled)
15. Plane Name Memory                 29 13 58 10 02 51 -04 713 (incomplete)
16. Memory for Ships                  29   -08    50    31    10    06    20    13   507

A question mark stands for an entry that is illegible in the source. Rows marked
(incomplete) or (garbled) have lost or damaged entries; their surviving values are
listed in source order and are not aligned with the column headings. The names of
tests 15 and 16 are supplied from the factor listings in the text.

1 Decimal points have been omitted.
2 For code numbers see table 11.22.

Results

The results of the analysis of the two batteries will be summarized
together, because many of the tests overlap and identical factors were
extracted from the two sets of intercorrelations. A test is listed under
a factor if a loading of 0.20 or higher is attained in either analysis.

Rotated factor I is the well-verified perceptual-speed factor in that
the Speed of Identification and the Spatial Orientation I and II tests
appear with by far the greatest loadings. Test loadings on this factor
are as follows:¹


Test                                                      Loadings
number   Test name                                        I       II
1        Speed of Identification, CP610A                  0.64    0.66
2        Spatial Orientation I, CP501B                     .61     .63
3        Spatial Orientation II, CP503B                    .61     .63
12       Memory for Plane Silhouettes, CI503AX1            .41     .?5
8        Map Memory, CI505AX1                              .35     ....
10       Map Memory, CI505AX2                              ....    .35
15       Plane Name Memory, CI506AX1                       ....    .29
16       Memory for Ships, CI504AX1                        ....    .29

¹ For this and the following factors, blanks (shown as ....) indicate the absence
of a test in a particular analysis. In these listings a question mark stands for a
digit or entry that is illegible in the source.

Rotated factor II has significant loadings in the following tests:

Test                                                      Loadings
number   Test name                                        I       II
6        Reading Comprehension, AC10D                     0.67    0.52
14       Memory for Tactical Plans, CI509AX1               .57     ....
8        Map Memory, CI505AX1                              .42     ....
7        Arithmetic Reasoning, CI206B                      .35     .27
11       Map Memory, CI505AX3                              .31     ....
5        Mechanical Comprehension, AC10D                   .27     .36


This is obviously the verbal factor found in most analyses. While
there are certain variations in the loadings of memory tests on this factor,
it seems clear that of these tests Memory for Tactical Plans is the
most verbal. The map-memory tests are next most heavily loaded with
verbal comprehension. The average verbal loadings of the other memory
tests are not appreciable.


The  following  data  define  rotated  factor  III : 


Test                                                      Loadings
number   Test name                                        I       II
12       Memory for Plane Silhouettes, CI503AX1           0.50    0.62
13       Memory for Landmarks, CI510AX1                    .56     ?
15       Plane Name Memory, CI506AX1                       ....    .58
16       Memory for Ships, CI504AX1                        ....    .50
9        Map Memory, CI505BX1                              ....    .41

The tests high on this factor are, with one exception, fundamentally
of the paired-associates form. Whether this factor should be so defined,
that is, in terms of the form of memorizing, is uncertain. It is perhaps
broader than this and could, on the basis of present evidence, be called
a "rote-memory" or an "associative-memory" factor. All but one of the
map-memory tests have near zero loadings here. This one, CI505BX1, is
heavily weighted with items based on a schematic map. The detail on
this map consists of names of cities, connecting roads, and mileages.
Such material could be expected to introduce a high "rote-memory"
loading, hence the assumption of a general rote-memory factor receives
some support.

The hypothesis of an associative-memory factor, however, better
accounts for the clean-cut distinction between the two groups of memory
tests: the recall by association on the one hand and the recall by
reproduction on the other. This hypothesis needs the support of other evidence,
such as finding the same factor in tests of serial learning. For
lack of more crucial evidence, it seems best to call this reference
variable the paired-associates memory factor, staying close to the more
apparent characteristics of the tests defining it.

Rotated factor IV is defined by the following data:


Test                                                      Loadings
number   Test name                                        I       II
4        Complex Coordination, CM701A                     0.46    0.52
12       Memory for Plane Silhouettes, CI503AX1            .43     .32
5        Mechanical Comprehension, AC10D                   .26     .33
16       Memory for Ships, CI504AX1                        ....    .21

This is undoubtedly the spatial-relations factor, which in this and
other analyses has been defined by the Complex Coordination test. Memory
for Plane Silhouettes and Memory for Ships have moderate loadings
with this factor. It is evident that these tests are so constructed that
persons high on the spatial-relations factor are aided in memorization
of this material.


The  following  tests  have  significant  loadings  on  rotated  factor  V : 


Test                                                      Loadings
number   Test name                                        I       II
7        Arithmetic Reasoning, CI206B                     0.47    0.66
6        Reading Comprehension, AC10D                      .29     .40
13       Memory for Landmarks, CI510AX1                   -.04     .22

This is most probably the general-reasoning factor always found in
Arithmetic Reasoning and Reading Comprehension. The discrepancy in
loading for Memory for Landmarks is quite unusual. In view of the
absence of this factor in other memory tests, one is led to suspect that the
zero loading here is more nearly correct.

Rotated  factor  VI  is  defined  by  the  following  data : 


Test                                                      Loadings
number   Test name                                        I       II
8        Map Memory, CI505AX1                             0.54    ....
11       Map Memory, CI505AX3                              .51    ....
9        Map Memory, CI505BX1                              ....    .52
10       Map Memory, CI505AX2                              ....    .58

This factor is restricted to map-memory tests in these analyses. It
is possible, however, that the factor is more general than this. This factor
has been called visual memory in view of the obvious visual content
of the tests that define it. It could be hypothesized that this is a more
general reproductive-memory variable, but evidence is lacking as to
whether it would be held in common by auditory and other types of
memory tests. Following these analyses, additional tests such as Plane
Formation Memory, CI513A, and Plane Position Memory, CI512A, were
constructed for the purpose of purifying tests for the supposed visual-memory
factor and for further clarification of this hypothesis by analysis.

Rotated factor VII is defined by the following data from battery I
only:


Test                                                      Loadings
number   Test name                                        I       II
3        Spatial Orientation II, CP503B                   0.31    ....
14       Memory for Tactical Plans, CI509AX1               .32    ....
11       Map Memory, CI505AX3                              .31    ....
5        Mechanical Comprehension, AC10D                   .29    ....

Though this factor is not well-defined, it is probably visualization.
Other analyses have shown Spatial Orientation II, CP503BX, to have a
moderate loading in this factor. Most mechanical-comprehension tests
have moderate to high visualization loadings. Although Memory for
Tactical Plans involves delayed memory for auditorially presented verbal
material, it is easy to rationalize a visualization component in the test.
Comprehension of the described mission and the answering of verbal
questions about it could both involve some visualization to good
advantage.

It is of considerable psychological importance to find that there is a
clear-cut hiatus between reproductive visualization or visual memory
(factor VI) and another type of visualizing which appears to require
manipulation and so may be called manipulative visualization. More
discussion of this point will be found in the chapter on visualization
(ch. 12).

Rotated factor VIII is defined by the following data from battery II
only:


Test                                                      Loadings
number   Test name                                        I       II
15       Plane-Name Memory, CI506AX1                      ....    0.51
13       Memory for Landmarks, CI510AX1                   ....     .44

This factor seemingly constitutes an unimportant doublet of the two
memory tests that are most similar to each other. Except for differences
in subject matter, these two tests are practically identical measures of
the same basic functions. The correlation between the two tests is 0.69.
Both involve the pairing of relatively simple figure and verbal symbols.
It is not unreasonable to speculate that in addition to a general
paired-associates memory ability (factor III above) common to all tests in
which items of information are memorized by pairs, there is also for
each type of pair a more restricted ability to learn, retain, and recall.

Should there also be an associative-memory factor, even more general
than that for memorizing in pairs (paired-associate factor), a complex
hierarchy of memory abilities would exist with separable variables of
different degrees of generality. The structure of memory abilities is
thus seen to need a thorough investigation from the factorial approach.

Rotated factor IX in memory battery II is seemingly a true residual
factor. The ninth centroid was used in the rotations of the other eight.
When the rotations were completed, factor IX had a smaller share of
the total variance in the battery than it had before rotations were started.
Since the centroid method does not extract the maximum common variance
possible with each successive factor, such an outcome is not
unexpected.

Conclusions

In conclusion, it can be definitely stated that two general memory factors
have been isolated. A third memory factor specific to a particular
type of test has also been isolated. It may be that memory factors of
limited scope can be multiplied almost indefinitely by relatively slight
changes in the format or content of memory tests. Such factors might
be useful in prediction within their own areas.

It would seem that at least one important memory factor remains to
be described. Memory for Tactical Plans, CI509, a test of delayed memory
for orally-presented material, does not appear to be weighted in
either of the two general memory factors. It is quite possible that addition
of similar delayed-memory tests to these batteries would result in
the isolation of a delayed-memory factor of some kind, or, instead, one
peculiar to auditory-verbal presentation, or both.

CHAPTER TWELVE

Visualization Tests¹


INTRODUCTION

Historical Background

One of the earliest systematic attempts to investigate the field of
visualization was a study conducted over 60 years ago by Sir Francis
Galton. His questionnaire concerning the vividness with which objects
from the morning’s breakfast table could be imagined, or visualized, is
a famous pioneer study. The field later proved interesting and challenging
to many investigators. At one time it even became the battlefield for
a fundamental psychological controversy as to whether there is any
thought process at all that does not involve some kind of images, visual
or otherwise.

Practical difficulties have retarded experimentation in this field.
Visualizing is a private experience, involving little or no overt behavior
that can be measured. It has been extremely difficult, therefore, to
devise objective experiments even to prove unequivocally that visualization
exists. Such experiences as dreams, hallucinations, vivid memories,
eidetic images, and so on are common enough, however, to convince
most observers that some process of visual imagination or visualization
does exist, even though it defies most attempts at objective measurement.

The older attempts at quantifying and measuring visualization were
aimed chiefly at discovering the degree of clarity or sharpness, the
persistence, and the frequency with which visualization took place. More
or less implicitly, these studies assumed that if visualization occurs at
all, the process is always the same; that is, the ability to visualize was
assumed to be a single unitary ability rather than a family of related
abilities. Though this assumption is natural and understandable, it is
certainly not the only one that can be made. It may be that there are
several abilities to visualize and that different tasks require different
visualizing abilities. More recently, some of the psychological problems
involved have been enumerated as follows:

Is visualizing flat forms the same as visualizing solid forms as they would
appear from different sides? Is visualizing solid objects the same as visualizing
movement of parts in a diagram of a machine? Considering only flat forms, is the
same ability required in visualizing several shapes singly as in visualizing how
these shapes could be fitted together, like jigsaw pieces for example? Or does the
latter task require some additional kinesthetic ability as well? These are examples
of problems of fundamental interest to psychological science, and if they can be
solved, the results will also be of considerable practical significance (1).

¹ Written by S/Sgt. Wayne S. Zimmerman.

Early Factorial Studies

Early Factorial Studies

The technique of factor analysis assumes the existence of separable
abilities, and furnishes a comparatively objective tool for analyzing
data in the light of this assumption. Perhaps it is from this objective
technique that the most promising evidence is to come for the establishment
of visualization as an independent mental ability or set of abilities.
Both Kelley (2) and Thurstone (1) found evidence in early factor studies
of the ability to visualize. Thurstone referred to a factor which he
thought could be characterized justifiably as "visual and spatial imagery."
He noted that it was probably the same factor that had appeared previously
as a spatial factor in the studies of Kelley. For some reason, the
term "spatial ability" rather than "visualization" was employed by these
investigators and their followers. It is probable that substantial variance
of both spatial and visualization factors was included.*

Visualization  in  Aviation  Psychology 

Perhaps the first mention of visualization as an important pilot ability
occurred  in  the  study  of  causes  of  failure  of  1,000  cadets  eliminated 
from  pilot  training.  In  a  study  of  faculty-board  proceedings,  it  was 
found  that  in  43  percent  of  the  cases,  deficiency  in  visualization  of  the 
flight  course  was  mentioned  as  a  reason  for  elimination.  Visualization 
of  the  flight  course  was  described  as  the  ability  to  "get  out  of  the  cockpit 
and  fly  the  plane  with  regard  to  the  horizon  and  other  reference  points." 

The first appearance of a factor that was later recognized as visualization
was in an analysis of a battery of nonverbal reasoning tests at
Psychological Research Unit No. 3.* On one axis, the following four
tests had loadings of 0.50 or more: Spatial Visualization I, 0.58; Pattern
Comprehension, 0.51; Mechanical Principles, 0.50; Spatial Visualization
II, 0.50. By inspection, these tests seem to have only one element
in common, that of the visual manipulation of images in solving the
problems.

Following this analysis, some of Thurstone's data were reexamined,
and an analysis was made of 19 of his tests based upon the published
matrix of intercorrelations. A visualization factor was isolated and defined
by the following tests: Punched Holes, 0.58; Form Board, 0.50;
Lozenges B, 0.45; Surface Development, 0.40. Three subsequent analyses,
including two of perceptual tests and one of mechanical tests, further
supported the belief in the existence of a visualization factor. Mechani-

•  1*  nwnMt’i  pmMtWm4  ml?*i  *Z  M  tuitUn,  ta  iatirtn  tt  il>  Sml  Iwtm  nmli 

■Sal  fanSrr  rwMaai  »f  ranfail  in  12  wwk  «ktf  ISwIiiril  uui  naU  mdwt  a  pramawi 
wliniw  Ihw.  vwk  NmM  Hat**,  t  A,  aa4  hfa  Zarw  U*ar4,  »  Ikai  ar4*r. 

WWI'M  nh  iW  M  naiSnat  Uiifrwo. 

• See chapter 7 for the complete description of this analysis.
cal Principles, Map Distance, Mechanical Comprehension, and Shortest
Path defined the factor in these analyses.

Tests that have been found to measure some phase of the ability to
visualize  are  described  and  discussed  in  this  chapter  under  the  sub¬ 
headings  (1)  Visual  Manipulation  and  (2)  Visual  Completion. 

VISUAL  MANIPULATION  TESTS 

Tests  described  within  this  group  have  in  common  problems  that  ap¬ 
pear  to  demand  mental  manipulation  of  a  visual  image  or  images.  This 
type  of  visualizing  calls  for  an  ability  to  imagine  the  rotation  of  depicted 
objects,  the  folding  or  unfolding  of  flat  patterns,  the  relative  changes  of 
position  of  objects  in  space,  the  motion  of  machinery,  or  the  maneuver¬ 
ing  of  airplanes  in  space.  In  all  tests  in  this  group,  the  examinee's  task 
is  to  record  the  final  position  or  positions  after  a  visualized  movement 
or  manipulation  has  taken  place. 

Under  this  heading  one  commercial  test  and  eight  experimental  tests 
are  discussed  in  the  order  in  which  they  were  chosen  for  study.  They  are: 
Pattern  Comprehension,  CP803A,  803B;  Spatial  Visualization  II, 
CI203AX1, 203A; Spatial Visualization I, CI204AX1, 204AX2; Visualization
of Maneuvers, CI657AX1, BX1, CX1, CX2; Formation
Visualization, CP814A, 814AX2; Point Motion (Crawford-Bennett),
Form B; Spatial Visualization III, CP108A; Position Visualization I,
CP534A; and Position Visualization II, CP111A.

Pattern  Comprehension,  CP803A  4 

This test was adapted by the AAF from Thurstone's "Surface Development."
Since it involved the visualized folding of flat patterns into
three-dimensional objects, it might be expected to be a measure of
manipulative visualization. At the time of its adoption, it was of interest
primarily because surface-development tests had traditionally been included
in mechanical-test batteries.

Description.— A  flat  pattern  lay-out,  showing  the  outline  in  solid 
lines  and  the  edges  along  which  the  folds  are  to  be  made  in  dotted  lines, 
is  presented  alongside  of  an  isometric  drawing  of  the  three-dimensional 
object  that  would  be  formed  by  folding  the  pattern  correctly.  Certain 
edges on the folded object are numbered, while edges and fold lines on
the flat pattern are lettered. The examinee's task is to match the numbers
and the letters. In order that inside and outside surfaces will not
be confused, two corresponding adjacent edges on the folded figure and
on the flat pattern are marked, one with an X and the other with an O.

(1)  Internal  characteristics. — The  directions  contain  one  illustration 
of  a  flat  pattern,  the  accompanying  illustration  of  the  three-dimensional 
figure,  and  two  recorded  but  unscored  sample  questions.  The  test  con¬ 
tains  7  patterns  and  32  scored  questions. 

'Devrtotwd  »t  P»y<holo*ttal  Rtturtk  Uait  X*.  J.  Cfcwf  »aalfikui*f*:  T/Sff.  P»«il  C  D«*<* 
and  Lt.  Una  Hulckinta*. 
(2) Administration. — All the necessary directions are contained in
the booklet. An answer legend which shows five of the letters from the
pattern listed as alternatives A, B, C, D, and E is provided with each
item in the booklet. Fifteen minutes are allowed to complete the items
in the booklet. The administration time is 5 minutes, making a total required
testing period of approximately 20 minutes.

A  sample  item  from  form  CP803A  is  shown  in  figure  12.1.  Following 
are  part  of  the  directions: 


FIGURE 12.1

SAMPLE PROBLEM OF PATTERN COMPREHENSION, CP803A

This  is  a  test  to  see  how  quickly  and  accurately  you  can  understand  the  relation¬ 
ship  between  a  pattern  drawing  and  the  object  it  represents. 

In  the  example  are  two  drawings,  one  showing  a  solid  figure  and  the  other 
showing  a  plane  figure.  If  the  solid  at  the  left  is  placed  on  the  figure  at  the  right, 
the  latter  can  be  folded  perfectly  around  the  solid.  The  figure  at  the  right  may 
therefore be called a pattern of the solid at the left. In the pattern the area
bounded by dotted lines corresponds to the base of the solid. Two of the edges of
the solid and the two corresponding dotted lines in the pattern have been marked
X and O. Using these two edges for reference, select the edges of the pattern
which correspond to each of the numbered edges of the solid and mark the answers
opposite the problem numbers on your answer sheet.

1.  Corresponds  to  (A)  h;  (B)  p;  (C)  f;  (D)  t;  (E)  k. 

2.  Corresponds  to  (A)  t;  (B)  f;  (C)  h;  (D)  k;  (E)  p. 

The correct answers are 1, A; 2, D.

(3)  Scoring. — The  scoring  formula  is  R—W/4. 
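The scoring formula can be sketched in a few lines of Python. This is only an illustration: the function name, the argument names, and the sample counts of rights and wrongs are assumptions, and omitted items are assumed to count neither as right nor as wrong.

```python
def formula_score(rights, wrongs, divisor=4, constant=0.0):
    """Rights minus a fraction of the wrongs, plus an optional constant.

    With divisor=4 and constant=0 this is R - W/4, the formula quoted
    for this test. Omitted items are assumed to be ignored.
    """
    return rights - wrongs / divisor + constant


# Hypothetical examinee: 20 items right, 8 wrong, 4 omitted.
score = formula_score(rights=20, wrongs=8)
print(score)  # 18.0
```

For five-choice items, dividing the wrongs by 4 is the usual correction for guessing, since a blind guess is expected to yield four wrong answers for every right one.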

Statistical  results. — The  data  given  below  are  for  samples  tested  at 
Psychological  Research  Unit  No.  3  in  March  and  April  1943. 

(1)  Distribution  statistics. — A  sample  of  229  unclassified  aviation 
students  yielded  a  mean  score  of  14.9  and  a  standard  deviation  of  8.3. 
The distribution curve was approximately symmetrical and considerably
flatter than normal.

(2)  Internal  consistency. — The  degree  of  homogeneity  of  the  items 
is  indicated  by  a  mean  internal-consistency  phi  of  0.42,  a  standard  devi¬ 
ation  of  the  phi  distribution  of  0.06,  and  a  range  of  values  from  0.28 
to 0.58. These statistics are based upon analysis of the responses of the
highest  27  percent  and  the  lowest  27  percent  in  total  score  of  a  group  of 
375  unclassified  aviation  students. 
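A minimal sketch of an internal-consistency phi of this kind follows. It assumes the coefficient is the ordinary fourfold phi computed between passing the item and membership in the highest or lowest 27 percent on total score. The text does not say whether the values were computed directly or read from a chart, and the function and variable names are hypothetical.

```python
import math


def upper_lower_phi(item_correct, total_scores, fraction=0.27):
    """Phi between item success and upper/lower group membership.

    item_correct: list of 0/1 flags, one per examinee.
    total_scores: list of total test scores, in the same order.
    """
    n = len(total_scores)
    k = max(1, round(n * fraction))          # size of each extreme group
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower, upper = order[:k], order[-k:]

    a = sum(item_correct[i] for i in upper)  # upper group, item right
    b = k - a                                # upper group, item wrong
    c = sum(item_correct[i] for i in lower)  # lower group, item right
    d = k - c                                # lower group, item wrong

    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else 0.0


# Tiny made-up example: ten examinees, item passed only by the top five.
totals = list(range(10))
flags = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
phi = upper_lower_phi(flags, totals)
```

An item that every examinee in both extreme groups passes has no variance, so the sketch returns 0.0 rather than dividing by zero.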

(3)  Difficulty. — Based  upon  item  analysis  of  the  responses  of  800 
unclassified aviation students, the test yielded a mean proportion of correct
responses of 0.50, corrected for chance, with a range of 0.13 to
0.79  and  a  standard  deviation  of  0.16. 
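The phrase "corrected for chance" is not spelled out in this section. One common form of the correction, assumed here, subtracts from the number of right answers the number of wrong answers divided by one less than the number of alternatives. The counts in the example are invented for illustration.

```python
def corrected_difficulty(n_right, n_wrong, n_examinees, n_choices=5):
    """Proportion passing an item after a standard guessing correction.

    Assumes wrong answers arise from blind guessing among n_choices
    alternatives, so that W / (n_choices - 1) right answers were lucky.
    """
    return (n_right - n_wrong / (n_choices - 1)) / n_examinees


# Hypothetical item: 800 examinees, 480 right, 320 wrong, none omitted.
p = corrected_difficulty(n_right=480, n_wrong=320, n_examinees=800)
print(p)  # 0.5
```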

(4)  Factorial  composition. — The  most  significant  loadings  are  in  the 
visualization  (0.50),  general-reasoning  (0.33),  perceptual-speed  (0.24), 
and reasoning II (0.24) factors. The communality is 0.55. For a full picture
of the factorial composition of this test, see appendix B.
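As a rough check on how loadings relate to communality, the sketch below squares the four loadings quoted above. It assumes uncorrelated (orthogonal) factors, under which each squared loading is the proportion of test variance attributable to that factor and the communality is the sum over all factors. Only the four listed loadings are used; the remaining loadings are in appendix B and are not reproduced here.

```python
# Loadings quoted in the text for Pattern Comprehension.
loadings = {
    "visualization": 0.50,
    "general reasoning": 0.33,
    "perceptual speed": 0.24,
    "reasoning II": 0.24,
}

# Under orthogonal factors, variance share = loading squared.
shares = {factor: value ** 2 for factor, value in loadings.items()}
listed_total = sum(shares.values())

reported_communality = 0.55
remainder = reported_communality - listed_total  # left to unlisted factors

for factor, share in shares.items():
    print(f"{factor:18s} {share:.4f}")
print(f"{'listed total':18s} {listed_total:.4f}")
```

The four listed factors account for about 0.47 of the variance, so roughly 0.08 of the reported communality must come from the smaller loadings not quoted in the text.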

(5) Test validity. — Validation results based on two samples are given
in table 12.1.

Table 12.1. — Validity data for Pattern Comprehension based upon graduation-elimination
of pilots in primary training
Variations of the test. — The first form of Pattern Comprehension,
CP803AX1, contained, in addition to the sample pattern and questions,
11 patterns and 72 scored items. The AX2 form was made up of the
items from the original form which proved to have the highest internal
consistencies. The directions remained unchanged. Seven patterns and
32 questions were selected. The time limit was reduced from 30 to 15
minutes. Both of these test forms were mimeographed. The A form was
printed rather than mimeographed. Except for revised directions, it is
identical with the shortened AX2 test.

In form CP803B * several changes were incorporated. In order that
the X and O designations on the patterns, which are necessary to define
whether a diagram represents an inside or an outside pattern, might be
eliminated, all diagrams are drawn as inside patterns. The necessity for
including a different answer legend for each question was removed by
adapting the answers to the 15-place IBM answer sheet. A proportionately
smaller number of questions per pattern is asked, 30 questions for
10 patterns being included for the entire test. The test is divided into
two equivalent parts, separately timed. Only two patterns were retained
from the preceding forms. The newly constructed diagrams are all asymmetrical,
in contrast with the symmetrical construction of the original
patterns, and they also have fewer sides. Both of these changes were
made in the effort to eliminate some of the reasoning content. It was
hypothesized that the symmetrical diagrams afforded more opportunity
for the examinee to derive answers by noting reverse relationships,

'  Dt'rl'M  it  r-rtk-t-.-iftl  Zufiftk  Vnit  S#  I.  CtUt  e***Hkt»**r»:  CifC  tii;4  & 
Humphrtjr*.  S(L  frw  II.  Mrtv*,  1*4  VS(l  W*y ■**  V  Timinill, 
which could be reasoned through to a solution. No data are available
on  this  form. 

Evaluation. — Pattern Comprehension, originally selected for study because
it was thought to be related to mechanical ability, proved to be a
fairly  good  measure  of  manipulatory  visualization.  The  high  correlation 
of  Mechanical  Principles  and  Pattern  Comprehension  (r=0.42)  is 
largely  due  to  the  saturations  with  this  factor.  A  source  of  dissatisfac¬ 
tion  has  been  the  high  loading  of  Pattern  Comprehension  with  general 
reasoning.  It  is  hoped  that  form  B  will  prove  to  be  somewhat  less  tainted 
with  that  factor. 

Pattern  Comprehension  shows  moderate  pilot  validity,  the  weighted 
validity  coefficient  being  0.16  for  a  total  sample  of  1,081  pilots  on  forms 
A  and  AX2.  From  its  known  factor  loadings  and  their  validity  for  the 
pilot,  one  would  expect  a  validity  of  0.14.  The  predicted  validity  of  the 
AX1  form  is  0.10;  the  obtained  weighted  validity  is  0.09  for  a  total  of 
525  cases. 
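The expected validity mentioned above can be illustrated with a short sketch. It assumes orthogonal factors, in which case the correlation a test should show with the criterion is the sum, over factors, of the test's loading times the criterion's loading (the factor's validity). The test loadings are those quoted earlier for Pattern Comprehension; the criterion loadings below are purely hypothetical placeholders, not the values used in the source, which are not given in this section.

```python
def predicted_validity(test_loadings, criterion_loadings):
    """Sum of products of test and criterion loadings on shared factors."""
    return sum(
        loading * criterion_loadings.get(factor, 0.0)
        for factor, loading in test_loadings.items()
    )


# Loadings quoted in the text for Pattern Comprehension.
pattern_comprehension = {
    "visualization": 0.50,
    "general reasoning": 0.33,
    "perceptual speed": 0.24,
    "reasoning II": 0.24,
}

# HYPOTHETICAL pilot-criterion loadings, for illustration only.
pilot_criterion = {
    "visualization": 0.20,
    "general reasoning": 0.10,
    "perceptual speed": 0.10,
}

estimate = predicted_validity(pattern_comprehension, pilot_criterion)
```

A factor on which the criterion has no loading, such as reasoning II in this made-up example, contributes nothing to the estimate however heavily the test is loaded with it.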

It  is  interesting  to  note  that  the  loading  in  the  mechanical-experience 
factor  is  only  0.06,  as  based  on  a  weighted  mean  of  the  loadings  found 
in  two  analyses.  Whatever  virtue  this  test  may  have  for  selection  for 
mechanical  tasks,  therefore,  probably  would  arise  from  its  perceptual- 
speed  and  visualization  loadings.  If  so,  there  are  much  better  tests  for 
that type of selection.

Spatial  Visualization  II,  CI203A  • 

This test was adapted from the Verbal Cubes test which was prepared
before the war by Col. J. P. Guilford. It was selected for study because
it promised to measure nonverbal reasoning. A battery of nonverbal reasoning
tests was being assembled for intercorrelational and factor analysis
(see ch. 7). Verbal Cubes was originally designed to be a measure of
the ability to manipulate mental images. This concept was only an incidental
consideration, however, at the time the revision was begun by
Psychological Research Unit No. 3.

Description. — For  each  group  of  items,  the  examinee  reads  a  verbal 
description  of  a  solid  block  of  wood,  its  sides  painted  different  colors, 
which  is  cut  into  smaller  blocks.  The  examinee’s  task  is  to  visualize  these 
cutting  operations  so  that  he  can  answer  questions  about  the  resulting 
numbers of blocks of given size and color.

(1) Internal characteristics. — The directions contain one verbal problem
description accompanied by two recorded but unscored sample questions.
Parts I and II each contain 6 descriptions and 22 scored items.
The problems increase in difficulty toward the end of each part.

(2) Administration. — On each page of the booklet is an answer
legend  for  converting  numbers  into  letters  to  correspond  to  the  letters 

• Developed at Psychological Research Unit No. 3. Chief contributors: S/Sgt. Jacob G. Elkin,
Lt. David H. Jenkins, and S/Sgt. Wayne S. Zimmerman.
on the 15-place answer sheet. Thirteen minutes are allowed for each
part.

Following  are  sample  items  and  accompanying  explanation  taken  from 
the  directions  from  the  test: 

The ends of a block of wood 1" x 1" x 3" are painted black, and the block is
then cut into 1-inch cubes.

Answer Legend

A. 1
B. 2
C. 3
D. 4
E. 6

1. How many 1-inch cubes are there?

2. How many 1-inch cubes have only one side painted black?

If you pictured the pieces of wood correctly, you should have marked "C" for
item 1 and "B" for item 2, since the 3-inch piece can be cut into three 1-inch cubes
but only the two end cubes would have painted sides.
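The sample problem can be checked by brute force. The sketch below is only an illustration of the counting involved, not part of the test materials: it enumerates the 1-inch cubes in a block and counts how many painted faces each one carries. The function name and the face labels ("z-" and "z+" for the two ends) are assumptions made for the example.

```python
from collections import Counter
from itertools import product


def painted_face_counts(dims, painted_faces):
    """Count unit cubes by number of painted faces.

    dims: block size in inches as (x, y, z).
    painted_faces: set of outer faces that were painted, named by axis
    and side, e.g. {"z-", "z+"} for the two ends along z.
    """
    counts = Counter()
    for position in product(*(range(d) for d in dims)):
        painted = 0
        for axis, name in enumerate("xyz"):
            if position[axis] == 0 and name + "-" in painted_faces:
                painted += 1
            if position[axis] == dims[axis] - 1 and name + "+" in painted_faces:
                painted += 1
        counts[painted] += 1
    return counts


# The sample item: a 1 x 1 x 3 block with both ends painted black.
counts = painted_face_counts((1, 1, 3), {"z-", "z+"})
total_cubes = sum(counts.values())   # item 1
one_side_black = counts[1]           # item 2
print(total_cubes, one_side_black)   # 3 2
```

The printed values correspond to answers C (3) and B (2) in the legend above.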

(3)  Scoring. — The  scoring  formula  is  R— W/5+20. 

Statistical results. — The data given below are for examinees tested
at Psychological Research Unit No. 3.

(1) Distribution statistics. — Distribution statistics for two overlapping
samples are given in table 12.2. The distribution curves are moderately
positively skewed and somewhat flatter than normal.


Table 12.2. — Distribution constants for Spatial Visualization II, CI203A, based
upon samples of classified pilots in class 44G
(2) Internal consistency. — The degree of homogeneity of the items
is indicated by a mean internal-consistency phi of 0.50, a standard deviation
of 0.11, and a range of values from 0.19 to 0.67. These statistics
are based upon analysis of the responses of the highest 27 percent and
the lowest 27 percent in total score on form AX1 of a group of 450
unclassified aviation students, tested in April and May 1943.

(3) Reliability coefficient. — By the alternate-forms method (part I v.
part II), a reliability coefficient of 0.84, corrected for length, was obtained.
This figure is based on a sample of 487 classified pilots in class
44G, tested in January 1944.

(4) Correlation between rights and wrongs. — For a sample of 500
classified  students  (tested  in  1944;  specific  dates  of  testing  not  reported) 
the  correlation  between  rights  and  wrongs  was  -0.67.  For  594  naviga¬ 
tors  tested  in  May  1944,  the  correlation  was  -0.66. 

(5)  Difficulty. — Based  upon  item  analysis  of  the  responses  of  the 
above-mentioned  sample  of  450  unclassified  aviation  students,  form  AX1 
yielded a mean proportion of correct responses of 0.51, corrected for
chance,  with  a  range  from  0.11  to  0.86  and  a  standard  deviation  of  0.19. 

(6)  Factorial  composition. — The  most  significant  loadings  are  in 
general-reasoning  (0.44),  visualization  (0.42),  reasoning  III  (0.36), 
and reasoning II (0.35) factors. The communality is 0.75. For a full
picture  of  the  factorial  composition  of  this  test  see  appendix  B. 

(7)  Test  validity. — Validation  results  based  on  several  samples  are 
given in table 12.3.

(8) Item validity. — Validation of items revealed a mean phi of 0.03
based  upon  the  responses  of  600  graduates  and  127  eliminees  from  pri¬ 
mary  pilot  training  originally  tested  in  December  1943  and  January 
1944.  The  standard  deviation  of  phi  values  was  0.05,  and  the  range 
was  from  —0.08  to  0.15. 

Evaluation. — Spatial  Visualization  II  is  a  fairly  good  measure  of 
manipulatory  visualization;  but,  like  Pattern  Comprehension,  it  is  so 
complicated  by  reasoning  factors  that  it  is  of  little  value  in  predicting 
pilot  success.  Twenty-two  percent  of  its  total  variance  is  attributable  to 
visualization,  22  percent  to  reasoning  I,  14  percent  to  reasoning  II, 
and  15  percent  to  reasoning  III.  The  average  obtained  validity  of  0.17 
exceeds  slightly  that  to  be  expected  from  its  loadings  with  known  fac¬ 
tors  (0.12).  This  fact,  taken  together  with  the  difference  between  the 
reliability and the communality of the test, suggests some unknown
source of validity. The very large proportion of reasoning variance,
however, renders this test unfit for use in pilot selection.
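The argument from the gap between reliability and communality can be made concrete. Using the reliability (0.84) and communality (0.75) reported above, and assuming the usual partition of test variance into common, specific, and error components, the sketch below computes the reliable variance that the known factors leave unexplained. The function name is hypothetical.

```python
def variance_partition(reliability, communality):
    """Split unit test variance into common, specific, and error parts."""
    return {
        "common": communality,                          # shared with known factors
        "specific": round(reliability - communality, 4),  # reliable but unexplained
        "error": round(1.0 - reliability, 4),           # unreliable variance
    }


parts = variance_partition(reliability=0.84, communality=0.75)
print(parts)  # {'common': 0.75, 'specific': 0.09, 'error': 0.16}
```

About 9 percent of the variance is reliable yet not accounted for by the known factors, which is where an unknown source of validity could reside.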

Because  of  its  combination  of  factors,  it  should  have  a  validity  of  at 
least  0.34  for  navigation  training.  The  limited  data  in  table  12.3  almost 
exactly  fulfill  this  expectation.  Because  it  is  a  complex  test,  however,  it 
should  not  be  included  in  a  battery  when  the  important  factors  are  al¬ 
ready  covered  by  purer  tests. 

Variations of the test. — CI203AX1 and CI203A* are identical, except
for certain changes in the directions. In the A form, an attempt
was  made  to  add  face  validity  by  pointing  out  in  the  directions  how  the 
task  presented  in  the  test  is  related  to  flying  problems. 


Spatial Visualization I, CI204AX1 *

When  the  antecedent  of  this  test  (Paper  Folding)  was  devised  by 
Col.  J.  P.  Guilford  before  the  war,  it  was  set  up  to  be  a  measure  of 
visualization, but because of high intercorrelations with reasoning tests,
it  was  adapted  by  the  AAF  for  the  purpose  of  studying  nonverbal  rea¬ 
soning  ability.* 

Description. — For each item, two or three illustrations show step by
step how a square, circular, or triangular piece of paper is folded and


1  Diircliun*  if  Sft.  Niitin  KrivHi.  ,  ,  .  . 

•  al  Kcitinli  l  an  No.  J.  Ckiff  tanlnkulW*:  U.  Fr**fc  Daw* 

S/Sgt. Wayne S. Zimmerman.

» See chapter 7.
finally cut. Blackened areas represent portions that are cut out after
the  final  fold  has  been  made. 

The  examinee's  problem  is  to  determine  how  the  piece  of  paper  will 
look  when  it  is  unfolded.  To  the  right  of  the  illustration  of  the  folds 
are  five  representations  of  unfolded  figures.  One  of  the  five  unfolded 
figures correctly shows all the creases made by folding and all the holes
made  by  cutting. 

(1)  Internal  characteristics. — The  directions  contain  one  recorded 
but unscored practice item. Each part contains 30 scored items. Items in
part I are made up of square pieces of paper and the items in part II
are made up of circular and triangular pieces.

(2)  Administration. — The  directions  consume  approximately  5  min¬ 
utes,  while  20  minutes  are  recommended  for  completion  of  the  items  in 
part  I  and  19  minutes  for  the  items  in  part  II,  making  a  total  testing 
time  of  approximately  44  minutes. 

In  figure  12.2  is  shown  the  sample  item  used  in  the  test. 


FIGURE 12.2

SAMPLE PROBLEM OF SPATIAL VISUALIZATION I, CI204AX1

(3)  Scoring. — The  scoring  formula  is  R— W/4. 

Statistical  results. — The  data  given  below  are  based  upon  samples 
tested  at  Psychological  Research  Unit  No.  3,  except  where  noted. 

(1)  Distribution  statistics. — Typical  examples  of  distribution  statis¬ 
tics  are  given  in  table  12.4.  The  distribution  curves  are  moderately  nega¬ 
tively  skewed  and  considerably  flatter  than  normal. 


Table 12.4. — Distribution constants for Spatial Visualization I, CI204AX1
(2) Internal consistency. — The degree of homogeneity of the items
is indicated by a mean internal-consistency phi of 0.35, a standard deviation
of the phi distribution of 0.14, and a range of values from —0.12
to 0.64. These statistics are based upon the responses of the highest 27
percent and the lowest 27 percent in total score of a group of 450 unclassified
aviation students tested in August 1943.
(3) Reliability coefficient. — By the alternate-forms method, an estimated
reliability coefficient of 0.84, corrected for length, was obtained.
This figure is based on a sample of 203 unclassified aviation students
tested in April 1943.

(4)  Correlation  between  rights  and  wrongs. — For  a  sample  of  735 
navigators  tested  in  February  1944  at  Ellington  Field  and  at  Psychologi¬ 
cal  Research  Unit  No.  3,  the  correlation  between  rights  and  wrongs 
was  —0.57. 

(5) Difficulty. — Based upon item analysis of the responses of 450
unclassified aviation students, the test yielded a mean proportion of correct
responses of 0.67, corrected for chance, with a range from 0.04 to
0.97 and a standard deviation of 0.34.

(6) Factorial composition. — The most significant loadings are in the
visualization (0.56), reasoning I (0.34), reasoning II (0.32), space
I (0.24), and reasoning III (0.21) factors. The communality is 0.71.
For a full picture of the factorial composition of this test see appendix B.

(7) Test validity. — Validation results based on several samples are
given  in  table  12.5.* 


Table 12.5. — Validity data for Spatial Visualization I, CI204, using the graduation-elimination criterion
1 Assuming an unrestricted stanine standard deviation of 2.00.
* In class 44A.


1  In  r|4«sr>  <iK,  4tG,  and  44lf. 

‘ Tested at Ellington Field and Research Unit No. 2 in February 1944.


Variations. — The AX2 * form is made up of 40 items selected on the
basis of internal-consistency phis from the 60-item AX1 form. Items are
arranged in order of increasing difficulty, which was determined from
the item analysis. Its leading factors and their loadings were found to be
visualization (0.53), general reasoning (0.39), integration III (0.34),
and the verbal factor (0.26). The communality was 0.69. There is no
ready explanation for the discrepancies between the two forms in secondary
factors.

Evaluation. — Spatial Visualization I proved to be one of the best
measures of manipulatory visualization, although significant loadings on
three reasoning factors and space I indicate a complex factorial pattern.
Forty-three percent of the total variance is attributable to the visualization
factor, 16 percent to general reasoning, and 14 percent to reasoning
II. The pilot validity to be expected from its factorial picture is 0.1[?],
and its obtained mean validity is 0.15. The two validities probably agree
within the limits of error. The expected validity for the AX2 form is
0.15 for pilots, and the obtained validity is 0.12. The obtained validity for
navigator selection is quite high (0.45). Since the reasoning I and visualization
factors account for only half of this validity, and probably less,
the test has something unique and substantial to offer in this connection.

This test suggested the construction of the promising Spatial Visualization
III test, an orally administered test of the ability to solve paper-
folding  problems. 

Visualization  of  Maneuvers,  CI657AX1  11 

This test was constructed as a possible measure of both visualization
and  integration. 

Description. — Each item begins with the presentation of an airplane
in a given attitude shown in a photograph. Then three simple maneuvers
are stated, such as: turn right 90°, nose up 45°, roll left 90°. Imagining
himself as the pilot of the plane, the examinee must visualize these
maneuvers in sequence, beginning from the pictured original position.
Then he must select the correct final position of the plane from five
alternative positions shown pictorially. Thus, the examinee must keep
in mind the changed position of the plane after the first maneuver and
from this position visualize the second maneuver. Again, he must hold
the new position in mind and from this second position visualize the
third maneuver. All maneuvers must be visualized from the pilot's position
in the cockpit, i.e., turn right means to the pilot's right, regardless
of the plane's position on the page.
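The chaining of pilot-relative maneuvers described above can be modeled as a product of rotation matrices. The sketch below is an illustration only, not part of the test. It assumes a starting attitude of level flight heading north, a world frame of north, east, and down, and body axes of nose, right wing, and belly. Because each maneuver is defined from the cockpit, each new rotation is multiplied on the right of the current attitude. All names are hypothetical.

```python
import math


def rotation(axis, degrees):
    """Right-handed rotation matrix about the body x, y, or z axis."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    if axis == "x":   # roll: positive lowers the right wing
        return [[1, 0, 0], [0, c, -s], [0, s, c]]
    if axis == "y":   # pitch: positive raises the nose
        return [[c, 0, s], [0, 1, 0], [-s, 0, c]]
    if axis == "z":   # yaw: positive turns the nose to the right
        return [[c, -s, 0], [s, c, 0], [0, 0, 1]]
    raise ValueError(axis)


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


MANEUVERS = {
    "turn right": ("z", +1), "turn left": ("z", -1),
    "nose up": ("y", +1), "nose down": ("y", -1),
    "roll right": ("x", +1), "roll left": ("x", -1),
}


def fly(attitude, maneuvers):
    """Apply pilot-relative maneuvers in sequence (right-multiplication)."""
    for name, degrees in maneuvers:
        axis, sign = MANEUVERS[name]
        attitude = matmul(attitude, rotation(axis, sign * degrees))
    return attitude


def body_axis(attitude, index):
    """World-frame (north, east, down) direction of a body axis."""
    return tuple(round(attitude[row][index], 4) for row in range(3))


level_north = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # assumed starting attitude
final = fly(level_north, [("turn right", 90), ("nose up", 45), ("roll left", 90)])
nose = body_axis(final, 0)         # points east and 45 degrees upward
right_wing = body_axis(final, 1)   # points west and 45 degrees upward
```

Multiplying on the left instead would express each maneuver in the fixed world frame, which is exactly the error the directions warn against when they say that right means the pilot's right regardless of the plane's position on the page.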

(1)  Internal  characteristics. — The  directions  contain  two  recorded 
but unscored sample items. Part I contains 28 scored items, and part II
contains 30.

(2) Administration. — Twenty-five minutes are allowed for each part
of the test. Directions and sample items consume about 10 minutes,
making a total testing time of 60 minutes.

One sample item is shown in figure 12.3. Following are parts of the
directions:

This is a test of your ability to visualize airplane maneuvers. In each problem
the pilot of a plane will take it through three maneuvers. On the left is shown
the starting position of the plane and on the right are shown five positions, one of
which is the final position of the plane after the maneuvers have been executed.

The completion of the third maneuver puts the plane in the position shown in
picture C. C is, therefore, the correct answer.

(3)  Scoring. — The  scoring  formula  used  is  R— W/4. 

Statistical results. — Except where specifically noted to the contrary,
the following data are based upon samples tested at Psychological Research
Unit No. 3.

"  u  hytlwlnwid  KiMtrta  Uttzi  cwtfriWfer:  VSgl  W«/«a 

liKWtfMkM. 
(1) Distribution statistics. — Typical examples of distribution statistics
are given in table 12.6. The distribution curves are approximately
symmetrical and considerably flatter than normal.


Table 12.6. — Distribution constants for Visualization of Maneuvers based upon
samples of classified pilots


Form 

Number  of  items 

N 

it 

SD 

AXl  . 

5$ 

!|  1^2 

27  4 

14  I 

BXl*  . 

St 

*i!«5 

Si. 2 

ii.» 

CXI*  . 

1  loo  alaaa  iiL' 

98 

*1.190 

56  6 

16.* 

j ii  isr. 

' For descriptions of these forms see page —.

*  In  class  t4K. 

•  In  class  440. 


(2) Internal consistency. — The degree of homogeneity of the items
in  form  AX1  is  indicated  by  a  mean  internal-consistency  phi  of  0.52,  a 
standard  deviation  of  the  phi  distribution  of  0.10,  and  a  range  of  values 


Table 12.7. — Alternate-forms (part I v. part II) reliability coefficients for
Visualization of Maneuvers, CI657


Croup 

V 

Form 

''it 

r* 

Classified  pilots'  . 

1,619 

AXl 

>0.81 

0.89 

Classified  pilots' . 

521 

XXI 

-  *.;4 

.16 

Unclassified  ariatioa  students'  . . . 

191 

CXI 

'.82 

.90 

Unclassified  aeiation  students'  , . . 

525 

CXI  | 

'.85 

.92 

Unclassified  aviation  students'  and 
Airplane  Mechanics . 1 

44* 

CXI 

*.84 

.91 

Classified  pilots*  . 

504 

CXI 

.$7 

•  In  class  44K. 

•  I'art  II  administered  immediately  after  part  t, 

•  Tested  at  Medical  and  Psychological  Examining  Units  Not.  6  and  S  In  April  IMS. 

•  Part  II  administered  approximately  four  hours  after  part  I. 

•  Ir  class  44G. 

from  0.26  to  0.80.  These  statistics  arc  based  upon  analysis  of  the  re¬ 
sponses  of  the  highest  25  percent  and  the  lowest  25  percent  in  total  score 
of  a  group  of  800  classified  pilots  in  class  44F. 

(3)  Reliability  coefficient.— The  three  forms  yielded  the  alternate- 
forms  estimates  of  reliability  given  in  table  12.7. 
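The report does not say how the second coefficient in each row of table 12.7 was obtained from the part I v. part II correlation. The legible pairs (.82 and .90, .85 and .92, .84 and .91) are consistent with a Spearman-Brown step-up to double length, and the sketch below rests on that assumption. The function name is illustrative.

```python
def spearman_brown(r_parts: float, factor: float = 2.0) -> float:
    """Estimate the reliability of a test lengthened by `factor`.

    Assumption: the two parts are parallel forms, so the full-length
    reliability is factor * r / (1 + (factor - 1) * r).
    """
    return factor * r_parts / (1.0 + (factor - 1.0) * r_parts)


# Part I v. part II correlations taken from the legible rows of table 12.7.
for r in (0.82, 0.85, 0.84):
    print(f"{r:.2f} -> {spearman_brown(r):.2f}")
```

If the published coefficients were computed some other way, the agreement shown here would be coincidental, so the sketch should be read as a plausibility check only.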

(4) Correlation between rights and wrongs.—Data are presented in table 12.8.


Table 12.8.—Correlations between rights and wrongs for Visualization of Maneuvers, CI657


Croup 

Form 

N 

AXl 

500 

HXI 

500 

CXI 

64  2 

CXI 

1,211 

-.1* 

-.11 


¹ Tested in 1944; specific testing dates not reported.

² Tested in June 1944 at Psychological Research Unit No. [illegible]; in April [illegible] at Psychological Research Units Nos. [illegible].

(5) Difficulty.—Based upon item analysis of the responses of the above-mentioned sample of 800 classified pilots, the AX1 form of the test yielded a mean proportion of correct responses of 0.46, corrected for chance, with a range from 0.19 to 0.84 and a standard deviation of 0.14.
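The report does not spell out what "corrected for chance" means at the item level. One common reading, assumed here, applies the same rights-minus-fraction-of-wrongs correction to the counts for a single item and divides by the number of examinees. The function name and the counts in the example are hypothetical.

```python
def corrected_item_difficulty(n_right: int, n_wrong: int,
                              n_examinees: int, n_choices: int) -> float:
    """Proportion passing an item after removing the expected share of lucky guesses.

    Assumption: for every k - 1 wrong answers, one right answer is credited
    to chance; omits are counted in n_examinees but in neither tally.
    """
    if n_examinees <= 0:
        raise ValueError("n_examinees must be positive")
    if n_choices < 2:
        raise ValueError("an item needs at least two alternatives")
    return (n_right - n_wrong / (n_choices - 1)) / n_examinees


# Hypothetical five-alternative item: 800 examinees, 460 right, 200 wrong, 140 omitted.
print(corrected_item_difficulty(460, 200, 800, 5))  # 0.5125
```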

(6) Test validity.—Validation results for the three forms of Visualization of Maneuvers are given in table 12.9.

(7)  Item  validity. — Validation  of  items  on  two  forms  of  this  test 
disclosed  the  results  recorded  in  table  12.10. 


Table 12.10.—Validity of items of Visualization of Maneuvers based upon samples of pilots in primary training, graduation-elimination criterion


Fora 

N. 

it* 

SD4 

Low 

Hick 

AXl  . 

>S4| 

0.M 

0.10 

0.00 

-0.0* 

0.2* 

AXl  . 

HU 

M 

.04 

.07 

-J07 

.21 

CXl  . 

•72J 

J) 

.04 

.07 

-.0* 

.IS 

¹ In class 44F.

² In class 44G.

Variations.—Forms BX1,¹² CX1,¹³ and CX2 differ from form AX1 only in the number of maneuvers called for between the initial and final positions of the airplanes. Form BX1 presents two maneuvers; both forms CX1 and CX2 present one maneuver. These other forms were developed in an effort to lessen the difficulty of the items and consequently to lay more dependence upon speed. It was hypothesized that good pilots excel in acts that are undeliberated rather than reasoned and that speeded tests, therefore, are likely to show more valid results. Forms AX1 and BX1 are of the same length, each containing 58 scored items divided into 2 parts. Form CX1 contains 98 scored items divided into 2 parts, while CX2 contains the first 43 items from form CX1 divided into 2 parts.

Forms AX1 and BX1 correlate more highly with Mechanical Principles and less highly with Complex Coordination than form C, a fact that supports an original hypothesis advanced during the construction of the test that the more complex form would show more visualization, the speeded form, more spatial content. This evidence needs the further support of factor analysis.

Evaluation.—The Visualization of Maneuvers tests proved to be one of the most valid types of printed tests for pilots in or out of the classification battery. It would be an excellent selection test for either pilots or navigators, but it would not make a good classification test in which a discriminating function is desired.

¹² Developed at Psychological Research Unit No. [illegible]. Chief contributor: S/Sgt. Wayne S. Zimmerman.

¹³ Developed at Psychological Research Unit No. [illegible]. Chief contributors: [name illegible] and S/Sgt. Wayne S. Zimmerman.

No factorial data are available at the time of this writing, but intercorrelations suggest that the AX1 form is comparatively complex and would, therefore, be used more appropriately in an omnibus test like the AAF Qualifying Examination. The level of validity for both pilot and navigator also assures us that the test is factorially complex and combines factors that are strongly valid for both specialties. For pilot validity, both space and visualization must surely be present in large amounts. For navigator validity some reasoning variance must also be present.

Formation Visualization, CP814A¹⁴

This test was developed in an effort to measure the examinee's ability to visualize in three dimensions. If the views of airplanes in formation are shown from two directions at right angles to each other, it is possible by visualization to determine the appearance of the same formation from the third orthogonal direction. Since this type of item presents a rather difficult visual manipulative problem, only a limited number of airplanes can be presented in each formation. It is known that overly difficult visualization items are likely to be reasoned through to a solution. In order to keep the difficulty level as low as practicable, only two or three airplanes were included in any single formation. The use of airplanes as the objects to be visualized adds face validity to the test.

Description.—Each item shows in silhouette a top view and a side view of a formation of either two or three airplanes. The examinee's problem is to visualize the appearance of the same formation from the front view. Four alternative front views are presented with each item, one of which is correct.


FIGURE 12.4

SAMPLE PROBLEM OF FORMATION VISUALIZATION, CP814A (top and side views of the formation, with four alternative front views, A to D)

¹⁴ Developed at Psychological Research Unit No. [illegible]. Chief contributors: Cpl. Albert A. [surname illegible] and 1st Lt. Robert L. Thorndike.

(1) Internal characteristics.—The directions contain one recorded, but unscored, sample item. The test contains 49 scored items. The first 19 items are composed of formations of 2 airplanes, while all of the remaining items are made up of formations of 3 airplanes.

(2) Administration.—Administration of the directions takes 2 minutes, and 15 minutes are allowed to complete the test items, making a total testing time of 17 minutes.

The sample problem from form CP814A is shown in figure 12.4. Following are parts of the directions:

This is a test of your ability to visualize plane formations. In each problem you will see two views of a formation of either two or three planes. One view will show the formation as seen from above; the other, the formation as seen from the side. Your task is to visualize how this formation would appear if it were seen from the front.

Below each formation there are four front-view diagrams, A, B, C, and D, only one of which correctly represents the formation as it would appear from the front. These diagrams are not drawn to scale. Remember that only one of the four diagrams in each item represents the front view of the formation.

Which  diagram  correctly  represents  the  front  view  of  the  formation  in  the 
sample  problem? 

As only one plane can be seen in the side view, the other plane must be concealed directly behind it. Therefore, the two planes in this formation should be visualized as flying side by side at the same altitude.

The correct front view of this formation is shown by diagram A.

(3) Scoring.—The scoring formula used is R − W/3 + 20.

Statistical results.—The data given below are for samples tested at Psychological Research Unit No. 3 in September and October 1944.

(1) Internal consistency.—The degree of homogeneity of the items is indicated by a mean internal-consistency phi of 0.36, a standard deviation of the phi distribution of 0.12, and a range of values from 0.13 to 0.66. These statistics are based upon analysis of the responses of the highest 27 percent and the lowest 27 percent in total score of a group of 1,500 unclassified aviation students.
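The internal-consistency phi reported for each item compares how often the high-scoring and low-scoring groups pass that item. The report does not state how the coefficients were computed; the sketch below uses the standard fourfold-table formula, and the counts in the example are hypothetical (405 is 27 percent of 1,500).

```python
from math import sqrt


def phi_coefficient(upper_pass: int, upper_fail: int,
                    lower_pass: int, lower_fail: int) -> float:
    """Phi for a 2 x 2 table of group (upper, lower) by item result (pass, fail)."""
    a, b, c, d = upper_pass, upper_fail, lower_pass, lower_fail
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / denominator


# Hypothetical item: 300 of the top 405 pass it, against 150 of the bottom 405.
print(round(phi_coefficient(300, 105, 150, 255), 2))
```

An item passed equally often by both groups yields a phi of zero, which is why low values flag items that do not agree with the total score.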

(2) Difficulty.—Based upon the responses of 750 unclassified aviation students, the test yielded a mean proportion of correct responses of 0.57, corrected for chance, with a range from 0.14 to 0.94, and a standard deviation of 0.23.

Evaluation.—No further data are available upon which conclusions can be based regarding the nature of this test. Both the internal consistency and difficulty levels are satisfactory.

Crawford-Bennett Point Motion Test¹⁵

Imagining the motion of machinery and following these motions mentally has been described as a demonstration of the ability to visualize.

Description.—Each item presents an assembly drawing of certain parts of a machine. The path that a single point on the machine will follow when the mechanism is set in motion is indicated. A second point in another section of the machine is also marked, but the path for this point is not shown. It is the examinee's problem to determine the exact path that the second point will follow when the mechanism is in operation. The correct path must be selected from four illustrated choices.

¹⁵ Published by [publisher illegible], New York, N. Y.

(1) Internal characteristics.—The test contains 1 unrecorded and unscored sample item in the directions and 30 scored items in the body of the test.

(2) Administration.—When this test was first administered to prospective air-crew members, 25 minutes were allowed to complete the items. The testing period was found to be unnecessarily long and was later reduced to 15 minutes.

The  sample  item  is  shown  in  figure  12.5.  Following  are  excerpts  from 
the  directions: 

How  will  point  X  move  when  point  P  moves  as  shown  by  the  arrows?  Choose 
your  answer  from  A,  B,  C,  or  D. 

B  is  the  correct  answer,  since  the  curve  B  best  describes  the  motion  of  point  X. 


FIGURE 12.5

SAMPLE PROBLEM OF CRAWFORD-BENNETT POINT MOTION, FORM B

(3) Scoring.—The scoring formula is R − W/3.

Statistical results.—This test was administered at Psychological Research Unit No. 3.

(1) Internal consistency.—The degree of homogeneity of the items is indicated by a mean internal-consistency phi of 0.38, a standard deviation of 0.18, and a range from 0.02 to 0.69. These statistics are based upon analysis of the responses of the highest 27 percent and the lowest 27 percent of 740 unclassified aviation students tested in June 1944.

(2) Difficulty.—Based upon the responses of the above-mentioned sample of 740 unclassified aviation students, the test yielded a mean proportion of correct responses of 0.48, corrected for chance, with a range from 0.03 to 0.94 and a standard deviation of 0.26.

(3) Test validity.—Validation results are given in table 12.11.


Table 12.11.—Validity data for the Crawford-Bennett Point Motion Test, form B, based upon elimination from primary training (N = 973; p = [illegible])


Score 

M, 

SD, 

.'♦is* 

Rights*  . 

17  At 

16.62 

1.91 

0.02 

0.20 

Wrongs*  . 

12.41 

11.16 

4.06 

-.10 

-.22 

R-W/l  . 

12.98 

12.21 

5.19 

.08 

.20 

¹ Assuming an unrestricted stanine standard deviation of 2.00.

² For this sample, the correlation between rights and wrongs is −0.91.


Evaluation.—A validity of −0.22 computed on the wrong scores and of 0.20 computed on the right scores indicates that this test does have some pilot validity, as was predicted. The validity was not high enough, however, to add significantly to the prediction efficiency of the classification battery in view of its high correlation with the pilot stanine.

Spatial Visualization III, CP53S¹⁶

Following a period of concentrated factor study, a program was outlined for the development of tests to measure the known factors of intellectual ability in as pure a fashion as possible. This test is one of a group of tests designed in an attempt to secure a pure measure of the visualization factor.

The paper-folding test, Spatial Visualization I, proved to be one of the best available measures of the factor, although it was more highly saturated with reasoning than was desired. It was hypothesized that (1) the reasoning content of Spatial Visualization I is primarily due to the opportunity afforded in the pictorial presentation to note relationships and to derive systems for obtaining answers without depending upon visualizing powers, and (2) a verbal presentation would reduce the opportunity to solve the problems by any method other than visualization.

Description.—From an orally delivered description, each item requires the subject to visualize the folding of a square sheet of paper into various shapes. The final correct shape must be selected from drawings presented in the test booklet.

(1) Administration.—The oral descriptions are presented by means of phonograph records. The examinees listen to the recorded test items with their booklets closed. When the description of the paper folding is completed, the examinee is instructed to open his booklet. He is then told the number of the item in the booklet, after which the correct choice can be found among five alternative illustrations. Ten seconds are allowed for locating and recording each answer. Then the examinee is instructed to close his test booklet before the next problem is presented.

The five alternative illustrations for the sample item are shown in figure 12.6. Following are excerpts from the directions:

¹⁶ Developed at Psychological Research Unit No. 1. Chief contributors: T/Sgt. [given name illegible] H. Shirley and S/Sgt. Wayne S. Zimmerman.

This is a test to see how well you can visualize. In every problem you are to imagine folding a square sheet of paper into various shapes. Since the directions cannot be repeated, you must listen very closely and follow each move as it is given.

Listen to sample problem one.

Imagine a square sheet of paper. Now imagine folding it in the middle from top to bottom. Now fold the upper left corner to the middle of the lower edge.


FIGURE 12.6

ALTERNATIVE ANSWERS OF THE SAMPLE PROBLEM, SPATIAL VISUALIZATION III, CPI08A (alternatives A to E)


Turn  to  page  two,  number  eight  in  your  answer  booklet;  page  two,  number  eight. 

Look  at  the  five  alternatives  listed.  Which  alternative  looks  like  the  paper  after  it 
has  been  folded?  Choice  D  is  correct.  Blacken  in  space  D  after  number  one  on 
your  answer  sheet. 

Close  your  booklet. 

(2) Scoring.—The scoring formula is R − W/4.

Evaluation. — No  data  are  available. 

Position Visualization I, CP534A¹⁷

This test is one of a group of tests designed for the specific purpose of obtaining a pure measure of manipulatory visualization. It was hypothesized that visual aids, such as drawings or illustrations of the objects to be visually manipulated, reduce the amount of visualizing required to solve the problems. Several tests were designed, therefore, with items presented in simple verbal terms so that the objects to be manipulated would have to be visualized without the help of visual cues. Due to the comparative ease with which its various positions could be described verbally, the United States flag was selected as the object to be visually manipulated.

Description.—Each item requires that a flag be visualized in a certain position. From the initial position the flag is to be both rotated and turned over according to specified directions. The examinee must visualize these manipulations and record the final position of the flag in terms of the location of the stars and the direction of the stripes.

(1) Internal characteristics.—The directions contain three recorded, but unscored, sample items. Part I contains 27 scored items, and part II contains 25 scored items. Items are printed verbally and are presented in tabular form.

¹⁷ Developed at Psychological Research Unit No. [illegible]. Chief contributors: S/Sgt. Benjamin Fruchter and Lt. John W. Howe, Jr.

(2) Administration.—Two minutes and 45 seconds are allowed to work the sample items. Twelve minutes are allowed for part I, and 10 minutes for part II. Reading directions takes about 5 minutes, making approximately 30 minutes total testing time.

Following  are  parts  of  the  directions: 

This is a test of your ability to imagine an object as it is moved from one position to another.

The object to be imagined is the American flag. At the start of each problem the position of the stars and stripes will be given; for example, stripes-horizontal; stars-upper left. Then you will be instructed to imagine moving the flag in certain definite ways, and in your answer describe the final position of the stars and stripes.

The flag will be moved only in the following ways:

The flag will be turned over the long way in some items and the short way in others. To turn over the flag simply means to switch the surfaces, as when you address an envelope and turn it over to seal it. Long way means the flag should be turned over lengthwise, as in illustration A. (See fig. 12.7.) Short way means it should be turned over crosswise, as in illustration B.

FIGURE 12.7

ILLUSTRATIONS OF FLAG TURNING AND FLAG ROTATING IN THE INSTRUCTIONS OF POSITION VISUALIZATION I, CP534A

The flag will be rotated 90° to the right in some items and 90° to the left in others. To the right means clockwise; to the left means counterclockwise. 90° means one-quarter turn.


The first practice problem is reproduced below:

Item   Starting position            Step I                     Step II
       Stripes       Stars
1.     Horizontal    Upper left     Rotate 90° to the right    Turn over long way

See the step-by-step solution to Practice Item 1 illustrated in figure 12.7.


FIGURE 12.8

ALTERNATIVE ANSWERS FOR THE SAMPLE ITEMS, POSITION VISUALIZATION II, CPIIIA

To describe the position of the stars and stripes on your separate answer sheet, blacken the space under "V" opposite question 1, because in the final position the stars and stripes are vertical. Blacken the space under "LR" because in the final position the stars are in the lower-right corner. Every answer will have two parts: one to show the vertical or horizontal position of the stripes, and one to show the position of the stars.
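The flag manipulations can be traced mechanically. The sketch below is an interpretation and not part of the test materials: figure 12.7 is not reproduced here, so it assumes that the stripes run along the flag's length, that turning over the long way exchanges the two ends of the long axis, and that turning over the short way exchanges the two long edges. These assumptions were chosen because they reproduce the keyed answer to practice item 1 (stripes vertical, stars lower right). All names are illustrative.

```python
# Corner codes: first letter U/L (upper/lower), second letter L/R (left/right).
ROTATE_RIGHT = {"UL": "UR", "UR": "LR", "LR": "LL", "LL": "UL"}  # clockwise
ROTATE_LEFT = {new: old for old, new in ROTATE_RIGHT.items()}


def swap_left_right(corner: str) -> str:
    return corner[0] + ("R" if corner[1] == "L" else "L")


def swap_upper_lower(corner: str) -> str:
    return ("L" if corner[0] == "U" else "U") + corner[1]


def move(stripes: str, stars: str, step: str) -> tuple[str, str]:
    """Apply one step to a flag described by stripes ('H' or 'V') and stars corner."""
    if step in ("rotate right", "rotate left"):
        table = ROTATE_RIGHT if step == "rotate right" else ROTATE_LEFT
        return ("V" if stripes == "H" else "H"), table[stars]
    long_axis_horizontal = stripes == "H"  # assumption: stripes run lengthwise
    if step == "turn long way":
        flip = swap_left_right if long_axis_horizontal else swap_upper_lower
    elif step == "turn short way":
        flip = swap_upper_lower if long_axis_horizontal else swap_left_right
    else:
        raise ValueError(f"unknown step: {step}")
    return stripes, flip(stars)


def solve(stripes: str, stars: str, steps: list[str]) -> tuple[str, str]:
    for step in steps:
        stripes, stars = move(stripes, stars, step)
    return stripes, stars


# Practice item 1: horizontal stripes, stars upper left.
print(solve("H", "UL", ["rotate right", "turn long way"]))  # ('V', 'LR')
```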

Statistical results.—None are available.

Position Visualization II, CPIIIA¹⁸

This  is  another  test  designed  for  the  express  purpose  of  obtaining  a 
pure  measure  of  manipulatory  visualization. 

Description.—Each item requires the subject to visualize, in response to an orally delivered description, four objects (disks) forming a square. Then certain disks are to be imagined moved to different positions. The final pattern must be visualized, so that it can be correctly selected from patterns presented in the test booklet.

¹⁸ Developed at Psychological Research Unit No. [illegible]. Chief contributors: [rank illegible] Allan L. [surname illegible] and T/Sgt. [given name illegible] Shirley.

(1) Internal characteristics.—The directions contain one unrecorded item and one recorded, but unscored, sample item. Parts I and II each contain 15 scored items.

(2) Administration.—Some of the directions and all of the problems are presented orally by phonograph record. The test booklet is closed during the reading of each problem. Directions to open the booklet and to look for the correct pattern following a specified number on a certain page follow the description of each problem. Directions to close the booklet precede each new problem. Administrative directions take approximately 5 minutes. Five minutes and 50 seconds are allowed for each part, making a total testing time of 17 minutes.

Following are parts of the directions:

In this test you are to imagine four disks forming a square.

First, imagine moving the lower left disk above the upper right disk. Now, imagine moving the bottom disk to the left of the leftmost disk.

Next, look at the five patterns. (See fig. 12.8.) Which is the correct final position of the disks?

Choice "D" is the correct answer.


FIGURE 12.9

SAMPLE ITEMS OF FLIGHT PATH, CP105A

(3) Scoring.—The scoring formula is R − W/4.

Statistical  results.— None  are  available. 

Evaluation  of  the  Subarea  of  Visual  Manipulation 

The evidence presented in numerous factor studies for the existence of an independent visualization factor is substantial. All of the tests heavily saturated with the factor seem to involve a visual manipulative ability. In solving the problems, it is necessary mentally to move, turn, twist, or rotate an object or objects and to recognize a new appearance or position after the prescribed manipulation is performed.

Although it has been particularly difficult to construct pure measures of the ability, due to the ever present contamination with reasoning, there is little doubt that visualizing of the type described is involved.

Some progress has been made toward the understanding of this factor, but there is still much to be learned. No visualization factor, as such, had been extracted by previous investigators, although the possibility of its existence has been recognized for many years. The new tests, Position Visualization I and II and Spatial Visualization III, give promise of further defining and clarifying the concept.

The best estimate of the validity of the factor for pilots is 0.20, based upon all available results (see table 28.17). Estimates of validities for other air-crew specialties are 0.06 for navigators and 0.20 for bombardiers. Any test having a loading in the factor as high as 0.70 would thereby contribute 0.140 to pilot validity, 0.042 to navigator validity, and 0.140 to bombardier validity by reason of this factor alone.
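The contributions quoted above are simply the product of the test's loading on the factor and the factor's validity for each specialty. The short sketch below reproduces that arithmetic; the dictionary and function names are illustrative.

```python
# Estimated validity of the visualization factor for each specialty (from the text).
FACTOR_VALIDITY = {"pilot": 0.20, "navigator": 0.06, "bombardier": 0.20}


def validity_contribution(loading: float, factor_validity: float) -> float:
    """Validity a test gains from one factor: its loading times the factor's validity."""
    return loading * factor_validity


for specialty, validity in FACTOR_VALIDITY.items():
    print(f"{specialty}: {validity_contribution(0.70, validity):.3f}")
# pilot: 0.140
# navigator: 0.042
# bombardier: 0.140
```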


VISUAL  COMPLETION  TESTS 


Rationale  of  Visual  Completion  Tests 

The typical test in this group calls for an ability to visualize the completion of a design or the extrapolation of a line or a path. This is merely another occasion for the visualizing abilities that a pilot, particularly, seems required to bring to bear upon his job, as when, for example, he forecasts his own position and the positions of other airplanes, friend or enemy, perhaps in the next split second.


Flight Path, CP105A¹⁹


A pilot must be able to determine accurately his projected flight path. He must be able to judge beforehand relative positions of his plane and reference points along the planned course of flight. This test was an outgrowth of an earlier attempt known as the Line of Flight Test, CP102A,²⁰ which was abandoned because a single, uncontested extrapolation of the suggested irregular curves could not be determined.

Description. — In  each  item  of  the  test  the  examinee  must  extrapolate 
an  arc  as  he  visualizes  a  plane  in  flight  completing  a  circle.  Only  a  part 
of  the  circular  course  of  each  plane  is  shown.  Seeing  only  a  part  of  the 
circle,  the  examinee  must  decide  through  which  of  several  points  the 
plane  would  pass  if  it  continues  along  the  same  curve. 
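Because every path in this test is a circle, each item has a single defensible answer: three points on the visible arc fix the circle, and a candidate point either lies on it or does not. The sketch below illustrates that geometry only; it is not how items were keyed in the report, and the coordinates, plane letters, and function names are hypothetical.

```python
from math import hypot

Point = tuple[float, float]


def circle_through(p1: Point, p2: Point, p3: Point) -> tuple[Point, float]:
    """Return the center and radius of the circle through three non-collinear points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d = 2.0 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if abs(d) < 1e-12:
        raise ValueError("the three points are collinear")
    s1, s2, s3 = x1 * x1 + y1 * y1, x2 * x2 + y2 * y2, x3 * x3 + y3 * y3
    cx = (s1 * (y2 - y3) + s2 * (y3 - y1) + s3 * (y1 - y2)) / d
    cy = (s1 * (x3 - x2) + s2 * (x1 - x3) + s3 * (x2 - x1)) / d
    return (cx, cy), hypot(x1 - cx, y1 - cy)


def plane_for_point(point: Point, arcs: dict[str, tuple[Point, Point, Point]]) -> str:
    """Pick the plane whose extrapolated circle passes closest to the given point."""
    def miss_distance(name: str) -> float:
        (cx, cy), radius = circle_through(*arcs[name])
        return abs(hypot(point[0] - cx, point[1] - cy) - radius)
    return min(arcs, key=miss_distance)


# Hypothetical page: plane A flies a circle of radius 5 about (0, 0),
# plane B a circle of radius 2 about (10, 0).
arcs = {
    "A": ((5.0, 0.0), (4.0, 3.0), (3.0, 4.0)),
    "B": ((12.0, 0.0), (10.0, 2.0), (8.0, 0.0)),
}
print(plane_for_point((-3.0, 4.0), arcs))   # A
print(plane_for_point((10.0, -2.0), arcs))  # B
```

The same uniqueness does not hold for irregular curves, which is the difficulty the text attributes to the earlier Line of Flight Test.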

(1) Internal characteristics.—The directions include a page of five recorded, but unscored, sample items. Parts I and II each contain 3 pages of 10 items each, making a total of 30 scored items per part. Five planes and their partial paths are illustrated on each page, with 10 lettered points through which the planes might pass.

(2) Administration.—Administration of directions and sample items consumes approximately 10 minutes, and 10 minutes are allowed for each of the 2 parts, making a total testing time of 30 minutes.

The  sample  items  are  shown  in  figure  12.9.  Following  are  parts  of 
the  directions: 


Each lettered plane is traveling a different circular path; that is, it will make a complete circle. Your task is to decide which plane will pass through each numbered point. The points, which correspond to item numbers on your answer sheet, are numbered on the page from left to right (in the sample from 101 to 110).

If plane A followed the arc, it would go through point 101. Therefore, the answer to item 101 is A.

Each point has one and only one plane which will pass through it. Each plane will pass through one or more points.

Statistical results.—This test was administered at Psychological Research Unit No. 3.

(1) Distribution statistics.—Typical examples of distribution statistics are given in table 12.12. The distribution curves are approximately symmetrical and considerably flatter than normal.


Table 12.12.—Distribution constants for Flight Path, CP105A


Croup 


Classified  pilots* . 

Classified  pilots' . 

Unclassified  aviation  students* 
Unclassified  aviation  students* 
Unclassified  aviation  students* 
Unclassified  aviation  students' 
Unclassified  aviation  students* 
Unclassified  avialion  students* 


ft  If  fits 

Wrongs 

Rights 

Rights 

Rights 

Wrongs 

Wroncs 

Wrongs 


formula 

Part 

N 

M 

SO 

1.1)2 

1.1)2 

29.9 

10.0 

1  and  11  .. 

24.0 

M 

I  . 

500 

I4.fi 

J-4 

||  . 

500 

11.4 

At 

I  and  II  .. 

500 

10.0 

to.) 

|  . 

500 

II.) 

4.0 

II  . 

500 

12.4 

fi.fi 

no  .  on.  . 

1  and  It  . 

500 

2D 

tOJ 

¹ In class 44[letter illegible].

² Tested in May 1944.


(2)  Reliability  coefficient.— For  right  and  wrong  scores  separately, 
one  sample  yielded  the  estimates  of  reliability  given  in  table  12.13. 


Table 12.13.—Estimated reliability coefficients for Flight Path, CP105A, based upon a sample of 500 unclassified aviation students,¹ alternate-forms procedure


Scoriae  formula 

0.44 

0.41 

.SO 

M 

¹ Tested in May 1944.


(3) Correlation between rights and wrongs.—The data are presented in table 12.14.

Table 12.14.—Correlation between rights and wrongs for Flight Path, CP105A


Group 

X 

*•» 

500 

-4)0 

12) 

-44 

¹ Tested May 1944.

² Tested [dates illegible].

(4) Test validity.—Validation results based on two samples are given in table 12.15.


Table 12.15.—Validity data for Flight Path, CP105A, based upon graduation-elimination of pilots in primary training


Scoring  formula 

SD, 

r»u 

.rM.* 

. 

M.JJ2 

0.86 

10.07 

28.64 

10.04 

0.08 

0.14 

Wrong*  . 

*1.332 

.86 

21.94 

24.51 

9.78 

-.01 

—.09 

Rixbtt  . . 

*521 

.7$ 

12.29 

10.11 

10.48 

.12 

A6 

Wron*»  . 

•523 

.75 

21.46 

24.58 

9.18 

-.20 

~.2i 

R-W/4  . 

1  A  ............  --  . 

*521 

.75 

26.96 

21.62 

12.08 

.14 

.21 

¹ Assuming an unrestricted stanine standard deviation of 2.00.

² In class 44[letter illegible].

³ Tested May 9 to July 10, 1944.


(5) Item validity.—Validation of items of this test disclosed the results recorded in table 12.16.

Table 12.16.—Validity of items of Flight Path, CP105A, based upon samples of pilots in primary training


Range  of  # 

UP 

SD# 

Low 

Hijrh 

*714 

0.82 

0.025 

0.05 

-0.12 

0.14 

*799 

46 

.004 

.04 

-.07 

.11 

¹ In class 44[letter illegible].

² In class 44[letter illegible].


Evaluation.—Flight Path failed to exhibit promising validity for pilots. Chief interest in the test lies in the question of whether or not it will measure some new factor or factors not hitherto defined. A low communality with existent tests indicates promise in this direction. Before its factorial content, which appears to be largely specific in combination with presently available tests, can be revealed, other tests with similar content will have to be introduced into the correlational matrix to be analyzed. Low validity for pilots is not absolute proof that it is not measuring to a small extent the same visualization factor that manipulation tests have in common. Should it prove to be deficient in this factor, however, we have evidence for the hypothesis that the visualization factor is of a special variety, perhaps confined to manipulation tests.

Of all tests mentioned in this chapter, this one comes nearest from the test-constructor's viewpoint to satisfying the oft-mentioned ability "visualization of the flight course." From that aspect it has good face validity.

Evaluation  for  the  Subarea  of  Visual  Completion 

Since Flight Path is the only test studied in this area, there is little
evidence from which to draw conclusions. Flight Path itself has no
correlation of more than 0.23 with any test on which sufficient data are
available for analysis. Such low correlations suggest that its common
variance would be so low that it would fail to appear significantly
projected on any known factor. For that reason, and because no apparently
similar tests are known, it has not been used for analysis in factor
studies. On an a priori basis, it would be easier to rationalize the
presence of a perceptual factor, a distance-estimation factor, or a
resistance-to-illusion factor than to explain why the known visualization
factor should show saturation in the test. Obviously, Flight Path contains
a substantial variance that is unique, as far as known tests are concerned.

EVALUATION  OF  VISUALIZATION  TESTS 

The questions quoted in the introduction to this chapter can be
answered more satisfactorily now than before the present work was begun.
In answering these questions, new problems have arisen which, it is
hoped, will promote further research in the area.

The first question, "Is visualizing 'flat' forms the same as visualizing
solid forms as they would appear from different sides?" is apparently
answered satisfactorily. Thurstone admits having entertained the
hypothesis that two- and three-dimensional spatial thinking might appear
as two separate abilities, until the emergence of a single spatial-visual
axis in his subsequent analysis denied the probability. Problems requiring
the examinee to rotate flags, figures, cards, and lozenges in a flat
plane appeared with factor patterns similar to those of problems
involving three-dimensional manipulations, such as switching the surfaces
of the objects visualized. Further evidence is presented in the data of
this chapter to indicate that the important feature of visualization is not
whether one-, two-, or three-dimensional movement of the visual image
is concerned, but whether movement of any sort takes place. In the
concept of visual manipulation a movement of some kind seems essential.

The second question referred to in the beginning of this chapter, "Is
visualizing solid objects the same as visualizing movement of parts in a
diagram of a machine?" can be answered with a positive "yes." In the
aviation-psychology program, the best known measures of visualization
are Spatial Visualization I (paper folding), Mechanical Principles, and
Spatial Visualization II.

Question number three was, "Considering only flat forms, is the same
ability required in visualizing several shapes singly as in visualizing
how these shapes could be fitted together?" This question is not
answered. According to the expressed description of manipulatory
visualization, some movement is required, and visualizing several shapes
singly seemingly requires none. By this token, only the manipulations
involved in imagining how the shapes could be fitted together could be
truly visualization of this sort. Then what is the ability to visualize
these shapes singly? Since no test has been analyzed that can claim to
measure such an ability, only preliminary hypotheses can be offered.

If the forms or details visualized are familiar, then possibly the
ability involved is one of pure recall, and the visual-memory factor
already reported will account for the variance. If, on the other hand,
these shapes must be constructed or created in the "mind's eye," an
entirely different ability may be needed. It is also possible that visual
construction or completion might involve manipulations, insofar as each
part or detail is "moved" into the visual picture and relegated to its
proper position. If so, the ability could be explained by the recognized
visualization factor. These are interesting questions and ones to which
answers could be utilized to great advantage.

Several  studies  designed  to  seek  objective  answers  to  these  questions 
were  begun  or  were  planned  in  the  later  stages  of  the  war-time  research 
program,  but  time  did  not  permit  their  execution. 
CHAPTER THIRTEEN


Mechanical  Tests1 


HISTORICAL  STATEMENT 

It has long been recognized by psychologists, vocational-guidance
authorities, and others concerned with the most efficient employment of
individual capabilities that differences exist in the abilities of
individuals to succeed in pursuits involving the operation and utilization
of mechanical equipment. Many attempts have been made to measure the
ability or abilities involved and thus arrive at a reliable and valid
basis for predicting success in such tasks. The resulting instruments may
be classified roughly in three categories: (1) job-sample tests,
(2) manual-ability tests, and (3) paper-and-pencil or printed tests. This
chapter is concerned with printed tests. No extensive recapitulation will
be made of results of previous civilian research in this area, but brief
consideration of some of the outstanding studies may constitute a useful
frame of reference within which to consider the results of the research
reported in this chapter.

Some Standard Mechanical Tests

A  number  of  printed  tests  of  mechanical  abilities  have  been  available 
to  the  public  for  some  time.  Some  of  these  will  be  mentioned  briefly. 

The Stenquist "Mechanical Aptitude Tests" (4) constitute one of the
earliest attempts (1921) to measure mechanical ability or aptitude by
means of printed tests. The two tests include mechanical-comprehension
and mechanical-information material. Results on these tests are reported
to correlate highly with ratings of proficiency given to students by shop
and science teachers. As a first effort in the field of selection of
workers for mechanical tasks, these tests are historically important.

Another milestone in the development of methods of measuring
mechanical aptitude or ability was provided by the Cox (2) Mechanical
Aptitude Tests (1928). Tests D and F consisted of material of the
mechanical-comprehension and mechanical-information types. These
tests were constructed in England and have been used with considerable
success in that country in selecting individuals for mechanical tasks. The
author recognized the differential effects of training and experience, and
attempted to make the tests as independent as possible of these aspects.

The publication in 1938 of Thurstone's factor analysis of 57 tests (5),
among which was a "Mechanical Movements" test, presented results of a
new approach to the problem of analyzing and evaluating objective tests.
Some discussion of this approach is found elsewhere in this volume.
Unfortunately, no clear definition of the factorial composition of
Thurstone's "Mechanical Movements" test was achieved, probably due to the
fact that not enough "mechanical" tests were included in the matrix.

Using the factor-analysis method, Harrell (3) examined a group of
mechanical tests and found five factors. These were identified as
(1) verbal, (2) manual dexterity, (3) youth, (4) spatial, and
(5) perceptual. Harrell concluded that the last two were the only ones
that uniquely identify mechanical tests. A factor analysis to be reported
in this chapter does not find manual dexterity, since only printed tests
were involved. It does find spatial and perceptual variance, in
confirmation of Harrell's results, but two other factors far outweigh
those two in most printed mechanical tests.

The list of traditional printed tests designed to measure mechanical
aptitude or ability includes the O'Rourke Mechanical Aptitude Test and
the Mechanical Comprehension Test (Form AA by Bennett, Form BB
by Bennett and Fry (1)). The O'Rourke test includes a part devoted to
pictorial-comprehension items and another part containing verbally
presented mechanical-information items. The Bennett tests consist entirely
of pictorially presented, practical, mechanical or physical problems.
Scores on these tests are reported to be positively correlated with
success in shop work, with success in vocational training courses, and
with the degree of complexity of mechanical tasks in which examinees were
employed. Information is not available, however, as to their correlation
with success in specific mechanical tasks. A test similar to Mechanical
Comprehension Test BB was constructed by Bennett for the United
States Navy for air-crew-selection purposes. High correlation was
reported between scores on this test and success in pilot training.

Two  Lines  of  Research  Indicated 

This brief review of research suggested two important aspects to be
explored: (1) the relationship of mechanical tests to success in specific
air-crew tasks, and (2) the factorial content of printed mechanical tests.
This chapter reports results of research that has explored these two
areas to a significant extent.

MECHANICAL  ABILITY  AS  RELATED  TO  AIR  CREW 
PERFORMANCE 

Job  Analysis  Findings 

It is generally recognized that successful performance of tasks
involving the use or operation of mechanical devices requires certain
special abilities. It has been assumed by some who have been interested in
the problem, however, that ability to succeed in such pursuits depends
upon factors not unique to mechanical tasks. On the basis of such an
assumption, job analyses of tasks involving use of machines might not
even include reference to machines, or their operation, as such.

In general, this seems to be true of the various job analyses of
air-crew duties, since the reports typically fail to mention mechanical
abilities, at least under this rubric. No explanation is made for this
omission, but the reasoning of the preceding paragraph probably identifies
one answer. Another explanation might be that the levels of mechanical
ability required are low enough to accommodate most or all who reach
the training stage.

Mechanical  Requirements  for  Air  Crew 

The  opposite  hypothesis,  that  mechanical  ability  as  such  is  unique  and 
is  an  important  determiner  of  success  in  air-crew  training,  has  been 
adopted  by  some.  A  brief  review  of  the  mechanical  content  of  the  jobs 
of  pilot  and  bombardier  will  reveal,  in  part,  the  basis  of  this  hypothesis. 

The pilot. — The pilot of a military plane is in full charge of its
operation, with responsibility for its proper functioning and, in the case
of a bombing plane, for the safety of the entire crew. In view of these
responsibilities, the pilot must perform certain duties on the ground
prior to take-off. These duties include thorough inspection of the plane
to determine whether it is in proper condition for the take-off and for
safe operation in the air. At first in ground school and later in other
phases of his training, the pilot is instructed in the construction and
function of the airplane and is trained in the meticulous performance of
all duties related to its operation. The pilot obviously must understand
well the mechanical parts and functions of the airplane in order to make
an intelligent and exacting inspection.

While the plane is in flight, the pilot is faced with the task of
observing, interpreting, and acting upon information received from dials,
indicators, etc., in the plane. Especially under combat conditions, the
pilot may frequently be forced to supervise improvisation of repairs on
damaged equipment or to devise means of replacing destroyed parts of the
plane. Such emergency action requires a clear grasp of practical
mechanics. The assumption that the ability to devise or improvise under
these conditions varies greatly is probably justifiable.

The bombardier. — The mechanical aspects of the bombardier's task
are principally related to the operation of the bomb sight. Early in his
training, the bombardier begins a study of the bomb sight. By the end
of his training, the bombardier has studied every part of the bomb sight
and is equipped to diagnose malfunctions and to make minor repairs.
Before every training or combat flight, the bombardier "preflights" the
sight. This operation includes checking the functioning of all parts of
the sight, calibrating indicators, and setting the sight for constant data,
such as field elevation and the like. In addition to the care of the bomb
sight, the bombardier has full responsibility for loading and arming of
the bombs and inspecting of bomb racks in the plane. Thorough knowledge
of their proper functioning is necessary to enable the bombardier to
insure proper and safe operation of racks and bomb release mechanism
on the mission.

Principal flight duties of the bombardier include operation of the
bomb sight and the automatic-pilot equipment. Ordinarily these tasks
consume only a few minutes, but the entire success of the mission
depends largely upon the accurate performance of these tasks. In
emergency situations, the bombardier may be called upon to improvise a
method of releasing the bombs, opening bomb bays, or the like.

Measurement  of  Mechanical  Ability 

From these descriptions of pilot and bombardier duties, it may be seen
that the tasks involved include a great deal of mechanical content.
Establishment of this fact is, however, only the first step in the process
of determining aptitude for the tasks. Of equal importance is the manner
in which the specific ability is to be measured.

Why printed tests were utilized. — The problem of measuring
nonintellectual or partially intellectual abilities by means of printed
tests has ever been a difficult one. The area of mechanical ability is no
exception, and many are prepared to claim that adequate measurement of
this ability is not possible by such techniques. In the early days of
psychological testing in the Army Air Forces, however, it was imperative
that means be sought to measure as many aptitudes or abilities as
possible by paper-and-pencil methods. This was due largely to the fact
that adequate psychomotor or job-sample tests did not exist in many areas
and the pressure of great numbers of examinees made the use of printed
group tests wherever possible highly desirable.

There are adequate reasons why this method of testing should prove
successful. On close examination, many or most practical mechanical
problems prove to have an important intellectual component as
distinguished from purely manipulative skill. This component is probably
not strictly "abstract" intelligence and certainly is not the same as
verbal ability. It includes the ability to gain insight into the
principles involved in mechanical problems. A second reason is that it
logically may be assumed that mechanical insight will result in or tend
to result in solution of practical mechanical problems.

Causes  of  Individual  Differences  in  Mechanical  Abilities 

Owing to the heterogeneity of the group of prospective air-crew
members, it is necessary to consider the factor of differential mechanical
experience. Apparently, unusual amounts of information in the field of
mechanics may stem from one or more of three causes: (1) the tendency
or desire to seek mechanical experience, (2) superior ability to
profit by mechanical experience, and (3) unusually rich opportunity to
gain mechanical experience.
Scores on mechanical tests may reflect individual differences in all
three of these aspects, depending upon emphasis. Mechanical interest
may be prognostic in that the air-crew job undertaken promises
satisfaction of such an interest. Ability to profit by mechanical
experience is also a favorable trait for learning the mechanical aspects
of flying. Generous opportunity for mechanical experience would be of
value only insofar as it resulted in habits or information that transfer
to air-crew performance. A student who makes a high mechanical-test score
because of unusual opportunity is probably a less good risk for training
than one whose opportunity may have been limited but whose aptitude for
mastering mechanical tasks is high.

The first two features — interest and aptitude — are probably positively
related. To measure the one is thus to measure the other to some
extent. Since they are both probably favorable to success, their
intermixture is not a serious matter.

Unusual opportunity to gain mechanical experience would probably
not correlate appreciably with either superior aptitude for things
mechanical or with the tendency to seek mechanical experience. One would
therefore attempt to minimize the variance in opportunity in favor of
variances in one or the other of the first two. As it turns out,
measurements represent an intermingling of the three, and one can only
trust that the prognostic components are not too much submerged for
practical purposes.

The  Plan  of  Research 

In view of the apparently complicated nature of mechanical ability, it
appeared advisable to employ several approaches. Preliminary "armchair
analysis" indicated two types of measures which should be useful:
measures of mechanical comprehension and measures of mechanical
information. The original plan included also measures of pattern
comprehension because such measures had been traditionally included in the
mechanical area. These tests were found to have little in common with the
mechanical tests, however, and are described elsewhere in this volume. A
test of physics was constructed for another purpose, but because of its
close relationship (superficial, at least) to mechanics, it is treated in
this chapter.

MECHANICAL  COMPREHENSION 
Definition  and  Rationale 

The area covered by mechanical comprehension is broad and includes
a wide variety of possible approaches. In general, mechanical
comprehension may be defined as the ability to follow, to understand, and
to predict the result of the operation of mechanical devices or machines.
Results obtained from the observation of actual machines would probably
yield the most valid measures, but two-dimensional pictorial
representations appear to be a fair substitute for actual machines. This
substitution is based upon the assumption that solution of two-dimensional
problems, where inspection but not manipulation is possible, involves
the same or similar intellectual functions as those required in solving
problems with three-dimensional machines. Variety can be secured by
presenting machines at various levels of complication, ranging from the
simple one- or two-part machine to the engine that contains a multitude
of parts.

Another approach explores knowledge or comprehension of common
physical laws encountered in everyday, nontechnical experience. Good
rationale for utilizing this approach is found in the fact that the
problems used may include material common to the experience of all or
almost all examinees. This fact should tend to minimize the effects of
differential experience. Additional justification of this approach lies in
the fact that principles involved in complicated machines are also
involved in much simpler and more fundamental form in mechanical or
physical phenomena of everyday life.

Mechanical  Principles,  CI903A  * 


An understanding, at least in a naive manner, of basic principles
governing mechanics appears to be fundamental to the solution of even
relatively simple specific mechanical problems. Obviously, it would be
impracticable to construct a large number of tests, each designed to
explore but one principle involved in such an activity as flying, even if
it were possible to isolate the principles and to prove their pertinence.
A single test which would explore the examinee's familiarity with a large
number of these basic principles, however, appeared to be feasible. Such
a test was constructed and given the appropriate title, "Mechanical
Principles Test."

A preliminary form of this test (CI903AX) constituted the exploratory
instrument upon which Form CI903A was based. A study of Bennett
and Fry's "Mechanical Aptitudes Test," Thurstone's "Mechanical
Movements Test," and other similar tests was made in preparing the
preliminary form. An effort was made to give the test face validity by
introducing practical mechanical principles in terms of aviation
situations whenever possible. Eighty-six items were drawn up and
administered experimentally. On the basis of item analysis, the 30 items
yielding the highest internal-consistency phis, and at the same time the
most appropriate difficulty indices, were selected to be used in Form
CI903A.

Description. — Mechanical Principles Test (CI903A) consists of 2
sample practice items and 30 scored items, all presented pictorially. In
each item, the examinee is asked to select the answer that describes most
accurately what is happening or will happen in the pictured situation.
The sample items in figures 13.1 and 13.2 are typical. The problem in
figure 13.1 is to determine which plane is about to turn to the left. In
figure 13.2, the examinee is required to determine which hook, if either,
is capable of lifting the heavier weight.


FIGURE 13.1

SAMPLE PROBLEM OF MECHANICAL PRINCIPLES, CI903A

2. Which plane is about to turn left?

2-A Plane A
2-B Plane B
2-C Both plane A and B

(1) Internal characteristics. — The examinee is required to choose one
of three alternatives. In most cases, the third alternative is a midpoint
or neutral position between the other two alternatives, such as "equal,"
"the same," "either," "neither," "both," etc. Unfortunately, it was
impossible to construct the test in such a way as to produce as many
correct answers in the neutral categories as in each of the other two.
Twenty-eight items contain third alternatives of this type, of which only
five are keyed as correct.

(2) Administration. — Instructions to the examinee are very simple,
being presented principally in connection with two sample items. Three
or 4 minutes suffice for the directions, and 15 minutes are allowed for
the scored items. Although rapid work is required to complete the test
in this time, 80 percent to 90 percent of the examinees are able to finish.
FIGURE 13.2

SAMPLE PROBLEM OF MECHANICAL PRINCIPLES, CI903A

30. On which hook could you lift the heavier weight?

30-A Hook A
30-B Hook B
30-C Equally heavy on both

(3) Scoring. — In view of the fact that very few examinees choose
the third alternative, formulas, first, of R-W and, later, of 2R-2W
were used in scoring the test. These formulas, however, produce a large
number of negative scores, which are rather difficult to handle in
combining them by machine with other scores, so formulas of 2R-2W+20,
2R-2W+30, and 2R-2W+40 were successively employed in order to
secure better adaptation to aggregate-weighting requirements.
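
The successive formulas differ only in the added constant, so they rank
examinees identically; the constant merely moves scores upward on the
scale. This can be illustrated with a minimal Python sketch, in which the
function name and the example counts are illustrative assumptions and not
part of the original scoring procedure:

```python
def formula_score(rights, wrongs, weight=2, constant=40):
    """Return weight*R - weight*W + constant (for example, 2R - 2W + 40)."""
    return weight * rights - weight * wrongs + constant


# Hypothetical examinee: 9 right, 19 wrong, 2 omitted on the 30 scored items.
raw = formula_score(9, 19, weight=1, constant=0)    # R - W
doubled = formula_score(9, 19, constant=0)           # 2R - 2W
shifted = formula_score(9, 19, constant=40)          # 2R - 2W + 40
print(raw, doubled, shifted)
```

For this hypothetical examinee the first two formulas give negative
scores, while the last gives a positive one.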

Statistical results. — Numerous statistics are available on this form of
the test.

(1)  Distribution  statistics. — Table  13.1  presents  distribution  data  for 
several  samples. 


Table 13.1. — Distribution statistics for Mechanical Principles, CI903A, based upon
samples of unclassified aviation students
(2) Optimal scoring formula. — A study to determine the optimal
scoring formula to maximize pilot validity showed that the rights and
wrongs should be weighted 1.00 and +0.18, respectively. This yielded an
estimated validity of 0.47 for primary pilot training. The data from
which this weighting was derived are based on a sample of 1,094
students in pilot training in class 43H, originally tested at Psychological
Research Unit No. 2. The means for rights and wrongs scores were
18.8 and 10.3, respectively. The intercorrelations were: $r_{RC} = 0.47$,
$r_{WC} = -0.41$, and $r_{RW} = -0.91$, in which subscript "C" refers to
the criterion. For practical scoring purposes, a weight of zero is
recommended for the wrongs score when the test is used for the selection
of pilots. In view of the almost perfect negative (-0.91) correlation
between rights and wrongs, however, a weight of -1, which was used in
scoring, would produce validity only slightly lower than that estimated
for the recommended weight of zero.
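
The reported weights can be checked against the three intercorrelations
with the usual least-squares formulas for two predictors. The following
Python sketch is offered only as an illustration: it works in
standard-score form, which amounts to assuming equal standard deviations
for the rights and wrongs scores (the source does not report them), and
the function names are hypothetical.

```python
import math

# Correlations reported in the text (R = rights, W = wrongs, C = criterion).
r_rc, r_wc, r_rw = 0.47, -0.41, -0.91


def standardized_weights(r_rc, r_wc, r_rw):
    """Least-squares weights for two predictors in standard-score form."""
    denom = 1.0 - r_rw ** 2
    beta_r = (r_rc - r_rw * r_wc) / denom
    beta_w = (r_wc - r_rw * r_rc) / denom
    return beta_r, beta_w


def composite_validity(w, r_rc, r_wc, r_rw):
    """Correlation of the composite (R + w*W) with the criterion."""
    return (r_rc + w * r_wc) / math.sqrt(1.0 + w * w + 2.0 * w * r_rw)


beta_r, beta_w = standardized_weights(r_rc, r_wc, r_rw)
relative_w = beta_w / beta_r    # weight for wrongs when rights are weighted 1.00
multiple_r = math.sqrt(beta_r * r_rc + beta_w * r_wc)

print(round(relative_w, 2), round(multiple_r, 2))
for w in (relative_w, 0.0, -1.0):
    print(round(w, 2), round(composite_validity(w, r_rc, r_wc, r_rw), 3))
```

Under these assumptions the sketch gives a wrongs weight of about +0.18
and a multiple correlation of about 0.47, in agreement with the reported
figures, and a validity of about 0.45 for a wrongs weight of -1, which is
only slightly below the 0.47 obtained with a weight of zero.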

(3) Internal consistency. — An item analysis, based upon the
performance of the highest 95 (25 percent) and lowest 95 of 380
unclassified aviation students tested at Psychological Research Unit
No. 3, yielded internal-consistency phi coefficients ranging from 0.22 to
0.67. The mean phi value was 0.44, with a standard deviation of 0.12. As
indicated by the data, the items proved to be quite homogeneous. This
consistency was achieved in large part by selecting from the 86-item
preliminary form the items with the highest phis and the most appropriate
difficulty.
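
An internal-consistency phi of this kind can be computed from the
fourfold table of group (highest or lowest on total score) against item
response (passed or failed). The Python sketch below uses the standard
fourfold-point formula; the source does not say how the coefficients were
actually obtained, and the item counts shown are hypothetical.

```python
import math


def phi_coefficient(pass_high, fail_high, pass_low, fail_low):
    """Phi for a 2 x 2 table: (high/low group) by (item passed/failed)."""
    numerator = pass_high * fail_low - fail_high * pass_low
    denominator = math.sqrt(
        (pass_high + fail_high) * (pass_low + fail_low)
        * (pass_high + pass_low) * (fail_high + fail_low)
    )
    return numerator / denominator if denominator else 0.0


# Hypothetical item: 76 of the highest 95 and 38 of the lowest 95 pass it.
item_phi = phi_coefficient(76, 19, 38, 57)
print(round(item_phi, 2))
```

An item passed about equally often by both groups yields a phi near zero
and would be a candidate for rejection under the selection rule described
above.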

(4) Reliability coefficients. — Several reliability estimates are given in
table 13.2.


Table 13.2. — Reliability coefficients for Mechanical Principles, CI903A

    N      Method                   Reliability

   500     Odd-even                 0.74
   240     Odd-even                  .71
   240     Odd-even                  .4[illegible]
   255     Kuder-Richardson IV       .71
   255     Test-retest              [illegible]

Tested January 23 [year illegible], at Psychological Research Unit No. 2.
Tested April 1943 at Psychological Research Unit No. 3.
Test-retest interval not reported.


(5) Difficulty. — Based upon the analyses referred to previously, the
mean difficulty of the items, corrected for chance, is approximately 0.55,
which is considered to be satisfactory.
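
The source does not state which correction for chance was applied. One
conventional correction, given here only as an assumed illustration,
subtracts the proportion expected from blind guessing among the three
alternatives and rescales the remainder. The function name and the
observed proportion are hypothetical.

```python
def corrected_difficulty(p_observed, n_choices=3):
    """Proportion passing an item after removing the chance-success rate."""
    chance = 1.0 / n_choices
    return (p_observed - chance) / (1.0 - chance)


# Hypothetical three-choice item answered correctly by 70 percent of examinees.
print(round(corrected_difficulty(0.70), 2))
```

Under this assumed correction, a corrected difficulty of 0.55 on a
three-choice item corresponds to roughly 70 percent of examinees marking
the keyed answer.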

(6) Factorial composition. — The Mechanical Principles test was
included in several factor analyses. Two of these contained principally
mechanical tests plus a few tests from the classification battery. These
analyses agree in the main in accounting for the variance of the test.
The highest loading occurs in a mechanical-experience factor, its mean
loading in the various analyses being 0.60. The second highest loading
(0.51) is in a visualization factor. Other significant loadings are in the
spatial-relations (0.22), verbal (0.20), and general-reasoning (0.20)
factors. Its communality is found to be 0.84 when all factorial results
are summarized.
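
For uncorrelated factors, the communality of a test is the sum of its
squared factor loadings. The short Python sketch below applies that rule
to the five loadings quoted above; treating the factors as uncorrelated
and treating these mean loadings as a single solution are assumptions
made only for illustration.

```python
# Loadings quoted in the text for the Mechanical Principles test.
loadings = {
    "mechanical experience": 0.60,
    "visualization": 0.51,
    "spatial relations": 0.22,
    "verbal": 0.20,
    "general reasoning": 0.20,
}

# Communality from these five factors alone (orthogonal factors assumed).
partial_communality = sum(value ** 2 for value in loadings.values())
print(round(partial_communality, 2))
```

The five quoted loadings account for a communality of about 0.75. The
difference from the reported 0.84 presumably reflects smaller loadings
that are not listed here and the way results from the several analyses
were summarized.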

(7) Test validity. — Preliminary validation data, gathered on two
different samples of aviation students in which little selection had been
made by disqualification at the time of classification, are shown in table
13.3. As indicated by table 13.3, there was good evidence of the validity
of the Mechanical Principles test for pilot training. The test became a
part of the classification battery of December 1942, and validation
data were subsequently accumulated for all three air-crew positions and
for certain other specialties. Data for several samples are given in tables
13.4 through 13.7.


Table 13.7. — Validities of Mechanical Principles, CI903A, for combat crew
specialties

    N       r (a)

   211      0.42
   211       .01
   194      -.02
   194       .15
   171       .11
   171       .46
   215      -.02
   153       .27

[The course and criterion entries cannot be matched to rows. Entries
legible in the source: final examination (twice); radio operator
mechanics (twice); graduation-elimination; average grades.]

(a) Product-moment correlation.
Tested with the December 1942 Battery at Psychological Research Unit No. 3.
Tested with the December 194[illegible] Battery at Psychological Research
Units Nos. 1, 2, and 3.
In class [illegible] at Ft. Myers. Tested at Psychological Research Units
Nos. 1, 2, and 3.
A very unreliable criterion.
In class 4148 at Ft. Myers. Tested at Psychological Research Units
Nos. 1, 2, and 3.

As indicated by the validation data, the Mechanical Principles test
shows greatest promise as a pilot-selection instrument. The biserial
correlation against the graduation-elimination criterion in primary
training for the samples given, combined by means of Fisher's z, is 0.33.
The correlation with graduation-elimination from bombardier training
(combined by the same method) is slight (0.13). The correlation with
navigator training success (combined by the same method) is moderate
(0.25).
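
The combination of sample correlations by Fisher's z can be sketched as follows. This is a minimal illustration, not the authors' worksheet: the weighting of each sample by N - 3 is the usual convention for product-moment correlations and is an assumption here, and the sample values are hypothetical, since the individual coefficients are given only in the tables.

```python
import math


def fisher_z(r):
    """Fisher's r-to-z transformation (the inverse hyperbolic tangent)."""
    return math.atanh(r)


def combine_correlations(samples):
    """Combine (r, n) pairs by averaging z values weighted by n - 3.

    The n - 3 weighting is an assumed convention; the report does not
    state which weights were used.
    """
    total_weight = sum(n - 3 for _, n in samples)
    mean_z = sum((n - 3) * fisher_z(r) for r, n in samples) / total_weight
    return math.tanh(mean_z)  # transform the mean z back to r


# Hypothetical correlations and sample sizes, for illustration only.
samples = [(0.30, 500), (0.36, 400)]
print(round(combine_correlations(samples), 3))
```

The combined value always lies between the smallest and largest sample correlations, and is pulled toward the larger samples.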

(8) Item validity. — Subsequent to the inclusion of the Mechanical
Principles test in the classification battery, pilot item-validity studies
were made. The results are reported in table 13.8. The phis of the two
samples correlate 0.68. A study of the relationship between
internal-consistency phis and item validities revealed a marked positive
correlation (0.64). This relationship strongly supports the practice of
selecting items for the test on the basis of internal-consistency phi
values.

Table 13.8. — Validity of items of Mechanical Principles, CI903A, for primary pilot
training, graduation-elimination criterion

Column headings (in part): N; SD; range of phi, low and high

Sample 1 (1):  1,091   0.S4   Oil   o.ot   0 07   0.22
Sample 2 (2):  960     .SI    .11   .06    -.OJ   .2*

1 Tested Sept. 1 to Nov. 10, 1942, at Psychological Research Unit No. 2.
2 Tested in Oct. 1943 at Psychological Research Unit No. 1.

When early data indicated that the validity of Mechanical Principles
was considerable, while that of Physics was very slight for pilots, it
appeared desirable to examine the items of the Mechanical Principles test
with the view to eliminating those items that are highly correlated with
Physics and are at the same time low or moderate in pilot validity. Since
it was supposed that the total score on Mechanical Principles reflected
some variance in knowledge of physics, it was recognized that the total
score on Mechanical Principles was not the best criterion for
internal-consistency studies. It was proposed that the total score on
Mechanical Information would be an excellent criterion, because it was
presumably free from physics variance. The Mechanical Principles items
were therefore analyzed, using scores in both the Mechanical Information
and the Physical Principles tests as criteria. Examination of the two sets
of phi values from these analyses revealed so strong a positive
relationship between them, however, that any selection on the proposed
basis was difficult to justify.

At  a  later  time,  when  factorial  composition  of  the  Mechanical  Princi¬ 
ples  test  was  better  known,  it  was  desired  to  segregate  items  into  separate 
pools,  each  relatively  pure  with  respect  to  one  of  the  factors.  It  was  as¬ 
sumed  that  some  items  were  strongly  mechanical,  others  spatial,  and 
still  others  visualizing  items.  The  Mechanical  Information,  the  Complex 
Coordination,  and  the  Pattern  Comprehension  tests  were  chosen  as  criteria 
for  the  mechanical,  spatial,  and  visualization  factors  respectively.  The 
three sets of phi coefficients were again highly intercorrelated, making
it  impossible  to  sort  the  items  into  three  factor  categories,  as  had  been 
intended. 

Although  it  is  recognized  that  the  criterion  tests  were  not  pure  meas¬ 
ures  of  the  factors  they  represented,  it  is  apparent  that  items  of  the 
Mechanical Principles test do not fall into factor categories but that they
are  typically  complex  factorially. 

Evaluation. — As indicated by the data presented, the Mechanical
Principles Test, CI903A, proved to be one of the most useful
pilot-selection instruments available. It combines in its total variance
large amounts of two highly valid factors, mechanical experience and
visualization, plus a smaller amount of another highly valid factor,
spatial relations. Its nonerror variance and its validity are fully
accounted for by known common factors.

Although proof of validity for air-crew success alone is available, the
factorial findings strongly suggest the possibility of using such a test in
selecting for a wide variety of mechanical pursuits. Its only defect is its
lack of purity. When a strong test of visualization is available, this test
could well be replaced with a combination of the visualization test and
Mechanical Information. Each component could then be appropriately
weighted, depending upon the criterion one desired to predict.

Mechanical  Principles,  CI903B  * 

This form of the test is the successor to Form CI903A already
described. It appeared that revision and introduction of new material might
provide a test which would be even more valid. Approximately 200
items, potentially useful in a new form, were designed and evaluated
carefully against the following requirements:

(1)  Apparent  relationship  to  the  more  valid  items  in  Form  A. 

(2)  Minimum  involvement  with  physics  information  and  reading 
comprehension. 

(3)  Moderate  difficulty  (around  0.50). 

(4)  Adequate  number  of  alternatives  (preferably  five). 

From these 200 items, 120 were selected on the basis of the above
requirements. These 120 items were separated into two groups of 60 each,
as nearly comparable in difficulty and content as possible. These two
groups became Forms CI903BX1 and CI903BX2 and contained identical
instructions and illustrative items. The two forms were administered to
1,920 classified pilots and the results employed in making an item
analysis. The upper and lower groups were determined by total scores on the
two forms combined, since both forms were given to the same examinees.
On this basis, internal-consistency phis varied from 0.00 to +0.51 on
BX1, and from -0.02 to +0.55 on BX2. Twenty-four items in BX1
and twenty-three in BX2 yielded phis of +0.35 or above. In view of the
high correlation (0.64) previously found between internal-consistency
phis and item validities on Form A, major weight was given to internal
consistency in selecting items from the BX1 and BX2 forms. Other
considerations were difficulty and the functioning of misleads. Because of
the high validity of several items in Form A, 14 items were selected to
be redrawn and used in Form B. Twelve items from BX1 and 14 items
from BX2 completed the group selected for Form B. Unfortunately,
time did not permit the obtaining of validity data on the items in BX1
and BX2 before this selection of items was made.
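
An internal-consistency phi of the kind used in this item analysis can be computed from the fourfold table of upper and lower groups against passing and failing the item. The sketch below uses the standard fourfold-table formula; the report does not say whether the values were computed directly or read from a chart, and the counts shown are hypothetical.

```python
import math


def phi_coefficient(upper_pass, upper_fail, lower_pass, lower_fail):
    """Phi coefficient for a 2 x 2 table of criterion group by item response."""
    a, b, c, d = upper_pass, upper_fail, lower_pass, lower_fail
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        return 0.0  # a margin is empty, so the item cannot discriminate
    return (a * d - b * c) / denominator


# Hypothetical item: 480 examinees in each criterion group.
phi = phi_coefficient(upper_pass=360, upper_fail=120,
                      lower_pass=200, lower_fail=280)
print(round(phi, 2))
```

An item passed equally often by both groups yields a phi of zero; an item passed more often by the lower group yields a negative value.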

Description. — This form of the Mechanical Principles test contains
40 scored items of the same general type as those in Form A. The
instructions are similar to those for Form A.

(1) Internal characteristics. — The number of misleads ranges from
three to five and averages four per item. There was a strong attempt to
make all misleads functional, with the result that the alternatives in this
form are much more uniform in appeal than are those in Form A.

• Developed at Psychological Research Unit No. 3. Chief contributors: T/Sgt. Paul C. Davis,
S/Sgt. Benjamin Fruchter, S/Sgt. Wayne S. Zimmerman.

(2) Administration. — The one sample item and directions require
about 5 minutes. The test time allowance is 20 minutes.

(3) Scoring. — The test was first scored R - W/2, but this was later
changed to R - W/2 + 20 to eliminate negative scores.
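
The effect of the added constant can be checked directly. In the sketch below the function name and arguments are illustrative only; the formulas are those given in the text. With 40 scored items, the lowest possible value of R - W/2 is -20, so a constant of 20 brings the minimum score to zero.

```python
def formula_score(rights, wrongs, divisor, constant=0):
    """Score a paper as R - W/divisor + constant."""
    return rights - wrongs / divisor + constant


# Form B has 40 scored items; the worst paper has every item wrong.
print(formula_score(rights=0, wrongs=40, divisor=2))               # -20.0
print(formula_score(rights=0, wrongs=40, divisor=2, constant=20))  # 0.0
print(formula_score(rights=40, wrongs=0, divisor=2, constant=20))  # 60.0
```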

Statistical results. (1) Distribution statistics. — Scored with the
formula R - W/2 + 20, the test yielded the data given in table 13.9.


Table 13.9. — Distribution of scores on Mechanical Principles, CI903B

Group    N         M       SD
1        1.500     12.1    9.1
2        1,020     10.4    9.0
3        1,1 <6    15.4    *.«
4        888       15.5    9.9

1 Tested November 1943 at Psychological Research Units Nos. 1, 2, and 3.
2 Tested November 1943 at Medical and Psychological Examining Units Nos. 4 through 10.
3 In class 44I. Tested November 1943 at Psychological Research Units Nos. 1, 2, and 3.
4 Class of 1946.

(2)  Internal  consistency. — Internal-consistency  phi  values  are  avail¬ 
able  from  Form  A,  BX1,  or  BX2,  on  all  items  in  Form  B.  The  phi  values 
of  the  items  in  their  original  tests  have  a  mean  of  0.16,  while  the  mean 
phi of the items as obtained by analysis of Form B is 0.4.+ The results
of the item analyses are given in table 13.10. The two sets of phis
correlate  0.47. 


Table 13.10. — Comparison of internal-consistency phi values obtained from
Mechanical Principles, CI903B, and preliminary forms

Using upper 50 percent and lower 50 percent of 960 unclassified aviation students.

Using upper 25 percent and lower 25 percent of 800 unclassified aviation students.

(3) Reliability coefficients. — Two estimates of the reliability of the
test are given in table 13.11. The correlation between the two
experimental forms (BX1 and BX2) is 0.70, based upon 970 cases. The best
estimate of reliability may lie between the two limits of 0.70 and 0.83,
but the communality (0.93) strongly suggests that the higher figure is
nearer the correct value.

Table 13.11. — Reliability coefficients for Mechanical Principles, CI903B

(4) Difficulty. — In order to establish norms, Form B was administered
to 530 unclassified students in October 1943 at Psychological Research
Unit No. 3. This sample yielded a mean score of 13.4 and a
standard deviation of 9.1. The mean difficulty index, corrected for
chance, was 0.30. In subsequent samples of classified pilots, however,
this figure rose to 0.44, which is similar to the difficulty level of Form A.
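
A difficulty index corrected for chance can be sketched as below. The exact formula used in the program is not stated in this section; the sketch assumes the conventional correction, in which the number of wrong answers divided by one less than the number of alternatives is subtracted from the number of right answers. The function name and the counts are hypothetical, and omitted items are ignored.

```python
def corrected_difficulty(n_right, n_wrong, n_alternatives):
    """Proportion passing an item after the conventional guessing correction.

    Assumes every wrong answer is a blind guess among the alternatives,
    which is only an approximation for items with functional misleads.
    """
    n_attempted = n_right + n_wrong
    return (n_right - n_wrong / (n_alternatives - 1)) / n_attempted


# Hypothetical five-alternative item answered by 500 examinees.
print(corrected_difficulty(n_right=300, n_wrong=200, n_alternatives=5))  # 0.5
```

An item answered at the chance level (one-fifth right on a five-alternative item) receives a corrected index of zero.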

(5)  Factorial  composition. — Although  Form  B  of  Mechanical  Princi¬ 
ples  was  in  the  classification  battery  for  a  longer  period  than  Form  A, 
it  did  not  appear  in  so  many  factor  analyses,  and  consequently,  some¬ 
what  less  is  known  concerning  its  factorial  composition.  Enough  infor¬ 
mation  is  available,  however,  to  indicate  close  factorial  similarity  to 
Form  A.  The  mechanical-experience  and  visualization  factor  loadings 
remain  approximately  the  same,  0.58  and  0.54  respectively,  while  the 
loading  in  the  spatial-relations  factor  is  significantly  lower  (0.12)  than 
that  for  Form  A  (0.22).  The  loading  in  the  verbal  factor  (0.03)  is  much 
lower  than  the  0.20  for  Form  A,  and  the  loading  in  the  general-reasoning 
factor  (0.34)  is  markedly  greater  than  in  Form  A  (0.20).  It  was  con¬ 
cluded,  however,  that  little  change  of  importance  had  taken  place  in  the 
factorial  content  of  the  test  in  the  change  from  Form  A  to  Form  B.  The 
communality  for  Form  B  reached  the  unusual  level  of  0.93.  The  gain 
over  that  for  Form  A  is  accounted  for  by  two  new  factors  in  which 
Form  B  has  loadings — carefulness  (0.17)  and  space  III  (0.28) — fac¬ 
tors  that  did  not  appear  in  analyses  including  Form  A. 

(6)  Test  validity. — Preliminary  statistics  on  the  internal  consistency 
of  items  selected  for  Form  B  indicated  that  higher  validity  could  be  ex¬ 
pected  for  this  form  than  was  obtained  with  Form  A  (if  the  relationship 
between  internal  consistency  and  item  validity  found  in  Form  A  (r  = 
0.64)  also  prevailed  in  Form  B).  Unfortunately,  from  the  standpoint 
of  predicting  validities  in  this  manner,  the  total-score  validity  for  Form 
B  is  about  the  same  (0.34)  if  not  slightly  lower. 

Subsequent  to  the  inclusion  of  Form  B  in  the  classification  battery, 
validation  data  were  gathered  for  3,146  pilots  and  two  classes  of  WASP 
trainees.  These  data  are  given  in  table  13.12. 


Table 13.12. — Validity of Mechanical Principles, CI903B, for various aviation
trainees

Group; Criterion; entries in printed order

Pilots in primary training*; Graduation-elimination; 3,146  0.84  35.95  32.39  0.23
... training*; 1,676  .89  .22  .35
WASPs*; 91  .80  24.41  22.22  6.18  .20  ....
WASPs*; Graduation-elimination; .61  23.60  20.00  6.70  .33  ....
Air mechanic; Final grades; 254  ....  •JJ

• Assuming an unrestricted stanine standard deviation of 1.90.
• In class 44J. Tested at Psychological Research Unit No. 3 with the November 1943 battery.
• Assuming an unrestricted stanine standard deviation of 1.83.
• WASP is the abbreviation for Women's Auxiliary Service Pilots. 91 cases in class 44-W-y,
104 cases in class 44-W-8.
• Product-moment correlation.


The  corrected  pilot  validities  of  0.33  and  0.35  were  achieved  not¬ 
withstanding  a  general  decrease  in  validity  of  classification  instruments, 
as  typified  by  the  Complex  Coordination  Test  which  suffered  a  reduc¬ 
tion  in  validity  from  approximately  0.36  to  approximately  0.32,  between 
the  July  1943  and  November  1943  batteries.  The  validities  found  for 
the  two  small  samples  of  WASP  trainees  indicate  that  the  predictive 
value  of  the  test  is  not  limited  to  the  male  sex. 
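
The footnotes to table 13.12 indicate that the corrected validities rest on an assumed unrestricted stanine standard deviation. The sketch below shows only the simplest form of such a correction, the case of direct selection on the variable whose standard deviation is known in both groups. It is not the authors' procedure, which corrected for selection on the stanine, and the standard deviations used are hypothetical.

```python
import math


def correct_for_restriction(r_restricted, sd_unrestricted, sd_restricted):
    """Correct a correlation for direct restriction of range on the predictor."""
    u = sd_unrestricted / sd_restricted
    r = r_restricted
    return r * u / math.sqrt(1 - r * r + r * r * u * u)


# Hypothetical standard deviations; only the obtained validity of 0.23
# is taken from the text.
corrected = correct_for_restriction(0.23, sd_unrestricted=10.0, sd_restricted=7.0)
print(round(corrected, 2))
```

When the two standard deviations are equal the coefficient is returned unchanged; the more the selected group is curtailed, the larger the upward correction.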

(7)  Item  validity. — A  sample  of  704  primary  pilot  trainees  in  class 
44H,  600  of  whom  graduated,  yielded  a  mean  validity  phi  of  0.08,  with 
a  range  of  values  from  —0.01  to  0.16,  and  a  standard  deviation  of  0.05. 

Evaluation. — Like  Form  A  of  Mechanical  Principles,  Form  B  proved 
to  be  one  of  the  most  valid  instruments  used  in  selecting  pilot  trainees 
for  the  Army  Air  Forces.  The  efforts  to  produce  a  test  more  valid  than 
Form  A  had  apparently  failed.  With  due  allowance  for  a  changed  cri¬ 
terion,  as  mentioned  before,  a  part  of  the  failure  was  due  to  the  reduced 
loading  in  spatial  relations. 

The  predicted  validity  for  Form  B  is  0.31  (see  table  28.18).  The  loss 
of  spatial  variance  was  not  a  serious  matter,  since  it  was  covered  by 
other  tests.  Other  available  data,  such  as  validity  for  air  mechanic 
trainees,  suggest  wider  usefulness  for  this  test. 

Item-validity  data  indicate  that  selection  of  items  to  yield  maximum 
pilot  validity  had  probably  not  yet  been  achieved.  While  the  test  is  also 
valid  in  the  selection  of  women  pilots,  its  value  for  this  purpose  is  some¬ 
what  less  than  in  the  case  of  men,  due,  no  doubt,  to  the  narrower  range 
of  ability  (probably  the  mechanical-experience  component)  as  indicated 
by  smaller  standard  deviations  (see  table  13.12). 


Mechanical  Functions,  CI907AX  * 

This  test  was  constructed  for  the  purpose  of  measuring  (1)  knowl¬ 
edge  of  tools  and  instruments  and  (2)  ability  to  comprehend  the  method 
of  operation  of  machines.  The  latter  was  conceived  as  a  rather  compli¬ 
cated  function  covering,  among  other  things,  an  integration  of  specific 
mechanical abilities, such as those measured in the Mechanical Principles,
Mechanical Information, and Mechanical Movements tests. The
understanding of the operation of individual parts is probably necessary to
an understanding of the operation of a machine. Such specific knowledge
or ability, however, may not insure understanding of the operation of
the machine as a whole.

Description. — This test is constructed in two parts that are quite
dissimilar. Part I consists of items showing pictures of tools that are to
be identified. This part was subsequently revised and became the Tool
Function test. In some of the problems, a single tool is pictured, and the
examinee is required to identify its use from a list of alternatives. Other
problems picture five tools from which one is to be selected as having a
certain specified characteristic or use. The problem shown in figure 13.3
is typical of the items in this part. In this problem, the examinee is
required to identify the tool that is best for measuring outside work on a
lathe.


FIGURE 13.3

SAMPLE PROBLEM OF PART I, MECHANICAL FUNCTIONS, CI907AX

Part II consists of items showing pictures of machines of varying
degrees of complexity. The problems consist of identifying, from a list
of alternatives, the functions of either the whole machine or of certain
specified parts. Several of the problems require that this identification
be made in terms of analogous parts of two different machines. A typical
item is shown in figure 13.4. The examinee is required to discover what
parts of the drum pump do the same things as parts 1 and 2 of the
air-cooled cylinder.


FIGURE 13.4

SAMPLE PROBLEM OF PART II, MECHANICAL FUNCTIONS, CI907AX

(1) Internal characteristics. — There are 15 items in each part.

(2) Administration. — In view of the simplicity of the task in this
test, directions are limited to general instructions concerning the method
of answering, guessing, and timing. These instructions require about 3
minutes. Six minutes working time is allowed for part I, and 10 minutes
for part II.

(3) Scoring. — The test is scored with the formula R - W/4. Part
scores were computed and used exclusively in view of the dissimilarity
of the parts.

Statistical  results. — This  test  is  one  of  those  included  in  a  battery  of 
mechanical  tests,  and  rather  complete  statistics  are  available,  although 
the  samples  are  not  large.  The  data  are  for  unclassified  aviation  students 
tested  in  August  1942  at  Psychological  Research  Unit  No.  3,  and  for 
those  of  this  group  who  entered  primary  pilot  training  in  class  43E. 

(1)  Distribution  statistics. — The  papers  of  those  who  eventually  went 
to  primary  pilot  school  (N=78)  yielded  a  mean  score  of  7.3  and  a 
standard  deviation  of  3.7  on  part  I  (Tool  Functions),  and  a  mean  score  of 
8.5  and  a  standard  deviation  of  3.6  on  part  II.  At  the  time  this  test  was 
given,  there  was  little  selection  of  students  sent  to  pilot  training. 

(2)  Difficulty. — The  test  is  of  approximately  average  difficulty.  Part 
I  yielded  a  mean  difficulty  index,  corrected  for  chance,  of  approximately 
0.50, while for part II, the mean difficulty, corrected for chance, was
approximately  0.55,  based  on  the  78  cases  in  primary  training. 

(3) Factorial composition. — Based on 153 cases, intercorrelations
involving the two parts of this test were factor analyzed. Part I (Tool
Functions)  proved  to  be  principally  informational,  having  a  loading  of 
0.77  in  the  mechanical-experience  factor  (which  is  best  identified  by 
the  Mechanical  Information  test).  A  moderate  loading  (0.30)  in  the 
perceptual-speed  factor  and  a  slight  loading  (0.18)  in  the  spatial-rela¬ 
tions  factor  complete  the  list  of  significant  factorial  components  of  this 
part.  The  communality  for  Tool  Functions  is  0.74,  which  probably  ap¬ 
proaches  its  reliability  fairly  closely. 

Part  II  has  moderate  loadings  in  mechanical  experience  (0.42)  and  in 
perceptual  speed  (0.35),  and  lesser  loadings  in  the  verbal  (0.24),  and 
general-reasoning  (0.22)  factors.  The  communality  is  only  0.41,  which 
is  probably  far  short  of  its  reliability. 
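
For uncorrelated common factors, the communality of a test is the sum of its squared factor loadings. The sketch below applies this to the loadings reported above for the two parts; the function and variable names are illustrative. The listed loadings account for nearly all of the communality of part II (0.41) and fall slightly short of that for part I (0.74), presumably because loadings too small to be called significant are not listed.

```python
def communality(loadings):
    """Sum of squared loadings on uncorrelated common factors."""
    return sum(value ** 2 for value in loadings.values())


# Loadings as reported in the text for Mechanical Functions, CI907AX.
part_1 = {"mechanical experience": 0.77,
          "perceptual speed": 0.30,
          "spatial relations": 0.18}
part_2 = {"mechanical experience": 0.42,
          "perceptual speed": 0.35,
          "verbal": 0.24,
          "general reasoning": 0.22}

print(round(communality(part_1), 2))
print(round(communality(part_2), 2))
```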

(4) Test validity. — Validation data on a small sample (N=78) of
primary pilot trainees revealed unexpected results. Part I yielded only
moderate validity (rbis = 0.17), but part II yielded a validity high enough
(rbis = 0.40) to suggest the advisability of revising and revalidating this
part of the test. Subsequent results indicated, however, that this sample
was atypical, since validation of a revised form yielded a biserial of only
0.26 on a sample of 877 pilots. No item validities were computed for
this test.

Evaluation. — Neither part of this test yielded stable pilot-validity
figures above the middle 0.20's. Both parts are also highly correlated with
more valid tests which appeared in the classification battery. For these
reasons, neither part was included at any time in the classification
battery.

The pilot validity to be expected for the Tool Functions test is 0.31,
however, and the test's relative purity (and high loading in the
mechanical-experience factor) attracts favorable attention to it. In
preparing a tool-functions section, the perceptual component should be
minimized by using perceptually clear and simple diagrams and by allowing
liberal working time.

The  pilot  validity  expected  from  the  Mechanical  Functions  test  is  0.15, 
based  on  known  factors  and  their  loadings.  The  indications,  therefore, 
are  that  there  may  be  an  unknown  factor  with  pilot  validity  in  this  test 
and  that  further  study  of  it  is  called  for.  The  difference  between  0.15 
and  0.29  is  too  large  to  be  ignored,  since  the  validity  of  0.29  was  ob¬ 
tained  from  a  composite  of  more  than  900  cases. 
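
An expected validity "based on known factors and their loadings" follows from the rule that, for uncorrelated common factors, the correlation between two variables equals the sum of the products of their loadings on the factors they share. The sketch below uses the part II loadings reported earlier; the criterion loadings are hypothetical placeholders, since the loadings of the pilot criterion are not given in this section, and the result is therefore not meant to reproduce the 0.15 quoted above.

```python
def predicted_correlation(test_loadings, criterion_loadings):
    """Sum of products of loadings on the factors two variables share."""
    shared = set(test_loadings) & set(criterion_loadings)
    return sum(test_loadings[f] * criterion_loadings[f] for f in shared)


# Loadings of part II of Mechanical Functions, as reported in the text.
part_2 = {"mechanical experience": 0.42,
          "perceptual speed": 0.35,
          "verbal": 0.24,
          "general reasoning": 0.22}

# HYPOTHETICAL criterion loadings, chosen only to make the sketch run.
pilot_criterion = {"mechanical experience": 0.25,
                   "perceptual speed": 0.15,
                   "verbal": 0.00,
                   "general reasoning": 0.05}

print(round(predicted_correlation(part_2, pilot_criterion), 2))
```

A validity obtained empirically that exceeds the value predicted in this way is what suggests an additional, unidentified factor shared by the test and the criterion.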

Variations of the test. — Several preliminary and subsequent forms
were constructed in the course of the exploration in this area.

(1) Mechanical Operations (no code)*. — This is the original form of
part II of the Mechanical Functions Test. It contains 37 scored items of
the type described under part II of Mechanical Functions, CI907AX. The
test  was  given  experimentally  at  Psychological  Research  Unit  No.  3  to 
320  unclassified  aviation  students  on  July  9,  1942,  for  item-analysis  pur¬ 
poses.  The  items  are  moderately  easy,  the  mean  difficulty,  corrected  for 
chance,  being  0.58.  The  test  is  quite  homogeneous,  as  indicated  by  a  mean 
internal-consistency  phi  of  0.40.  The  best  15  items,  as  judged  by  difficulty, 
discriminating  value  of  alternatives,  and  internal  consistency,  were  se¬ 
lected  to  go  into  Mechanical  Functions  Test,  CI907AX. 

(2) Tool Function (original form; no code). — This test was also
first devised as a separate test, then combined with mechanical-functions
items in the Mechanical Functions test (CI907AX), and later again
divorced as a separate test. The original form contains 39 scored items of
the type described under part I of Mechanical Functions, CI907AX. This
form was given for experimental purposes in July 1942 at Psychological
Research Unit No. 3 to 360 unclassified aviation students, and an item
analysis was made. It proved to be easy, the mean difficulty index,
corrected for chance, being 0.61. The mean internal-consistency phi was high
(0.43) for the experimental form. Items selected from this form on the
basis of difficulty and internal consistency were used to form the 15-item
part I of Mechanical Functions, CI907AX.

(3) Tool Function, CI906A. — This is merely a separate presentation
of part I, Mechanical Functions, CI907AX, previously described.

(4) Mechanical Functions, CI907A. — This form is a separate
presentation of part II, Mechanical Functions, CI907AX, with some very slight
changes in wording of alternatives and in arrangement of items.

(5) Mechanical Functions, CI907B. — As a result of the promising
validation data obtained for part II of Mechanical Functions, CI907AX, it
was deemed advisable to revise the test and prepare a form which, if the
validity held up, might be used in classification. Twelve items from part II
of CI907AX were used, and 22 additional items of similar type were
added. This form was administered to classified pilots in class 44C. It
proved to be easier than part II of the AX form, the mean difficulty,
corrected for chance, being 0.60. The items are quite homogeneous, the mean
internal-consistency phi being 0.47. Validation of this form of the test on
a sample of 877 pilots in primary training yielded a corrected biserial of
0.26. The elimination rate for this sample was 10 percent.

Item validation of this form showed a mean phi of 0.08.

Mechanical Movements, CI904AX2 *

With this test, an attempt was made to measure the assumed specific
ability to comprehend and follow the operation of moving parts of
machines. This test is similar to the mechanical movements test used by
Thurstone in his analysis of primary mental abilities (5).

Description. — The test consists principally of questions about the
movement of parts of machines. The parts are pictured, and the items,
with multiple-choice answers or completing clauses, appear beside or
below the drawings. Arrows and letters appear at appropriate places on
the drawings to indicate parts or directions of movement. Correct
answering of an item requires understanding the interaction of the parts,
which vary in number from 2 to 12. The sample item shown in figure
13.5 is typical. The examinee is required, in this problem, to select the
correct alternative: “When X turns in (A) direction G, Y turns in
direction G; (B) direction G, Y turns in direction F; (C) direction F,
Y turns in direction F; and (D) direction F, Y turns in direction G.”


FIGURE 13.5

SAMPLE PROBLEM OF MECHANICAL MOVEMENTS, CI904AX2

• Developed at Psychological Research Unit No. 3. Chief contributor: T/Sgt. Paul C. Davis.

(1)  Internal  characteristics. — The  test  consists  of  2  practice  items  and 
48  scored  items.  The  number  of  alternative  responses  ranges  from  three 
to  five  per  item,  with  an  average  of  four.  For  the  purpose  of  determining 
reliability,  the  test  is  separated  into  two  equal,  independently-timed 
parts. 

(2)  Administration. — Reading  of  the  directions  requires  approxi¬ 
mately  5  minutes.  The  sample  diagram  is  accompanied  by  two  problems 
followed  by  the  correct  answers.  Twenty  minutes  are  allowed  for  each 
part  of  the  test  proper.  In  one  sample,  approximately  60  percent  of  the 
examinees  completed  part  I,  while  only  about  35  percent  finished  part 
II.  A  larger  proportion  completed  the  next-to-the-last  item  in  each  part, 
67  percent  and  46  percent  respectively,  which  indicates  that  the  time 
is  somewhat  shorter  than  adequate  to  allow  most  to  finish. 

(3)  Scoring. — The  two  parts  of  the  test  were  scored  separately,  using 
the formula R - W/4. The part scores were summed to give a total
score  for  use  in  validation  and  in  correlating  the  test  with  other  tests. 

Statistical  results. — This  test  was  explored  quite  fully  statistically. 
The  data  given  below  are  for  examinees  at  Psychological  Research 
Unit  No.  3. 

(1)  Distribution  statistics. — Table  13.13  gives  distribution  data  for 
this  test. 


Table 13.13. — Distribution of scores for Mechanical Movements, CI904AX2

Group                               Score      N      M      SD
Unclassified aviation students 1    R - W/4    479    22.1   7.6
Primary pilot trainees 2            R - W/4    674    23.0   7.4
Class 44J 3                         R          272    22.2   5.6
Class 44J 3                         W          272    14.0   S.4

1 Tested February 1943.
2 In class 43H.
3 In class 44J.

(2)  Internal  consistency.— Item  analysis  of  the  test  revealed  a  wide 
range  of  internal  consistency.  The  phi  values  had  a  mean  of  0.35  and 
standard  deviation  of  0.12.  These  data  are  based  on  the  upper  25  per¬ 
cent  and  the  lower  25  percent  of  400  unclassified  aviation  students  tested 
in  April  1943. 

Some consideration was given to the possibility of using some of the
mechanical-movements items in a revision of the Mechanical Principles
test. In order to determine which items were most highly correlated with
Mechanical Principles, mechanical-movements papers of the highest and
lowest 25 percent groups as determined by scores on Mechanical
Principles (CI903A) were analyzed. Phi values from this analysis had a
lower mean (0.27), but comparison of the two analyses revealed no
significant differences in the rank order of discrimination. None of the
mechanical-movements items was as closely related to Mechanical
Principles total score as to Mechanical Movements total score.

(3) Reliability coefficient. — Reliability of the test was estimated by
correlating the two part scores. This yielded a reliability coefficient
(corrected) of 0.76, based on an N of 479 unclassified aviation students
tested in February 1943.
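The correction mentioned here is not named in the text. Assuming it is the usual Spearman-Brown step-up from a half-length correlation to full length, the sketch below shows the formula and its inverse; under that assumption the uncorrected part-part correlation behind the reported 0.76 would have been about 0.61. The function names are illustrative only.

```python
def spearman_brown(r_parts: float, k: float = 2.0) -> float:
    """Estimated reliability of a test k times as long as each part."""
    return k * r_parts / (1 + (k - 1) * r_parts)


def uncorrected_from(r_full: float, k: float = 2.0) -> float:
    """Invert the step-up: the part-part correlation implied by r_full."""
    return r_full / (k - (k - 1) * r_full)


r_half = uncorrected_from(0.76)      # implied correlation between the two parts
r_check = spearman_brown(r_half)     # stepping up again recovers 0.76
print(round(r_half, 3), round(r_check, 2))
```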

(4)  Difficulty. — Item  difficulties,  corrected  for  chance,  ranged  from 
approximately  0.00  to  about  0.82  with  a  mean  of  0.52,  and  a  standard 
deviation  of  0.19  based  on  approximately  400  cases  tested  in  April  1943. 
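A difficulty index "corrected for chance" can be sketched as follows. The report does not give its formula; this assumes the same guessing correction as the scoring rule (rights minus one-fourth of wrongs for five-choice items), expressed as a proportion of all examinees. The counts are hypothetical.

```python
def corrected_difficulty(n_right: int, n_wrong: int, n_examinees: int,
                         n_choices: int = 5) -> float:
    """Proportion passing an item after removing the expected lucky guesses."""
    return (n_right - n_wrong / (n_choices - 1)) / n_examinees


# Hypothetical item: 400 examinees, 240 right, 120 wrong, 40 omitted.
typical_item = corrected_difficulty(240, 120, 400)

# An item answered at the pure-guessing level (one in five right) corrects to zero.
chance_item = corrected_difficulty(80, 320, 400)
print(typical_item, chance_item)
```

On this reading, a corrected difficulty near 0.00 means the item was answered no better than blind guessing would predict.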

(5) Factorial composition. — This form of the test was not included
in any factor-analysis study, so no data as to its factorial composition
are available. Factorial composition of an earlier form and comparison
with Thurstone's findings are covered in the discussion of Form A of
this test.

(6) Test validity. — On a sample of 674 primary pilot trainees in class
43H the mean score for the graduates was 23.5 and for the eliminees
20.8. The standard deviation was 7.4, the proportion of graduates was
0.83, and the biserial correlation, 0.20.
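The reported biserial can be checked from the four figures given. The sketch below assumes the standard biserial formula, r = ((Mp - Mq) / SD) * (p * q / y), where y is the normal ordinate at the point dividing the distribution into proportions p and q; the report does not print the formula, so this is a reconstruction rather than the authors' worksheet.

```python
from statistics import NormalDist


def biserial(mean_pass: float, mean_fail: float, sd_total: float,
             p_pass: float) -> float:
    """Biserial correlation from group means, total SD, and proportion passing."""
    q_fail = 1.0 - p_pass
    unit_normal = NormalDist()
    # Ordinate of the unit normal curve at the graduate/eliminee cut.
    y = unit_normal.pdf(unit_normal.inv_cdf(p_pass))
    return (mean_pass - mean_fail) / sd_total * (p_pass * q_fail) / y


r_bis = biserial(mean_pass=23.5, mean_fail=20.8, sd_total=7.4, p_pass=0.83)
print(round(r_bis, 2))
```

With the figures for class 43H this gives approximately 0.20, in agreement with the value stated above.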

Evaluation. — This  test  was  not  used  in  classification  for  three  im¬ 
portant  reasons.  In  the  first  place,  its  validity  is  lower  than  validities 
of  tests  ordinarily  used  in  classification.  Secondly,  it  correlates  highly 
(0.67)  with  the  composite  pilot-aptitude  score.  Thirdly,  it  also  corre¬ 
lates  highly  (0.63)  with  the  Mechanical  Principles  test,  which  yields  a 
higher  pilot  validity  than  Mechanical  Movements. 

It  is  probable  that  this  test  might  be  useful  as  a  selective  device  in 
certain  mechanical  areas  where  Mechanical  Principles  or  a  similar  test 
would  be  unsuitable.  Improvement  of  the  test  and  investigation  of  its  ap¬ 
plicability  to  other  areas  might  prove  productive. 

Variations of the test. — Several forms of this test were constructed
prior to the form already described.

(1) Mechanical Movements, CI904X1. — This is the first form of the
test, containing 58 items. It was given experimentally for the purpose of
correcting and improving the items and selecting the most suitable ones
for a new form of the test. Although the mean difficulty, corrected for
chance, is about 0.50, many of the items are exceedingly easy, and extensive
revision of others was required. This form was administered to about
400 unclassified aviation students for item analysis only in July 1942 at
Psychological Research Unit No. 3.

(2) Mechanical Movements, CI904X2. — This form is somewhat
harder than form X1, the mean item difficulty, corrected for chance, being
about 0.40. There is considerable range of difficulty among the 38 scored
items, approximately 20 percent answering the hardest item correctly and
87 percent answering the easiest item correctly. The test was administered
to several squadrons of unclassified aviation students at Psychological
Research Unit No. 3 and some validation data obtained. One sample, of
which 353 eventually went to primary pilot training in class 43D, yielded
a biserial of 0.21 (p = 0.77, M_p = 14.52, M_q = 12.09, and SD_t = 6.74).
Results of an item analysis of this form were utilized in selecting items for
the next form.

(3) Mechanical Movements, CI904A. — This form contains 5 practice
items and 30 scored items selected from the experimental forms,
previously  referred  to,  on  the  basis  of  difficulty,  strength  of  misleads,  and 
internal  consistency.  The  test  was  administered,  along  with  several  other 
mechanical  tests,  to  one  squadron  of  unclassified  aviation  students. 

Factor analysis revealed that Mechanical Movements is extremely
complicated factorially. Table 13.14 shows the principal factor loadings of
this test, the loadings of the Mechanical Principles test in the same
factors, Thurstone's results on his Mechanical Movements test, as reported
in "Primary Mental Abilities" (5), and results of a reanalysis of
Thurstone's intercorrelations.


Several important differences exist in these analyses, notably in the
reasoning, mechanical, and visualization aspects. Thurstone's three
reasoning factors were not well differentiated, and reanalysis reduced the
number to two. Analysis of the mechanical battery revealed no
deductive-reasoning factor and left the general-reasoning factor with a rather
weak loading. It appears possible that what Thurstone named reasoning
may have been better defined in the Psychological Research Unit No. 3
reanalysis and in the analysis of the mechanical battery as visualization,
although the two would at best be only roughly equivalent. The
important residuals in Thurstone's analysis probably account for much of
the variance of Mechanical Movements. Much of this residual variance
might well be the mechanical factor that appears in the analysis of the
Psychological Research Unit No. 3 Mechanical Battery. Since no other
mechanical tests appeared in Thurstone's battery, no common mechanical
factor could be defined.

As indicated by table 13.14, Mechanical Movements and Mechanical
Principles have much in common factorially. The only important
exceptions are the greater loading in the perceptual-speed factor, and the
much smaller loading in the mechanical-experience factor for Mechanical
Movements. The much higher pilot validity (0.34) of Mechanical
Principles than of Mechanical Movements (0.23) is largely due to the fact that
the mechanical-experience factor is a more important determiner of pilot
validity than is perceptual speed.


Summary and Evaluation of Mechanical Comprehension Tests

As indicated by the statistical data for the tests in this section,
positive results were attained in the search for abilities correlated with
success in air-crew training, especially that of navigator and pilot.
Outstanding is the relatively high validity of the Mechanical Principles test,
two  forms  of  which  were  in  classification  batteries  during  most  of  the 
period  covered  by  this  report.  This  validity  results  principally  from  its 
loadings  with  the  mechanical-experience  and  visualization  factors  (espe¬ 
cially  for  the  pilot)  and  to  some  degree  from  its  loading  with  the  spatial- 
relations  factor.  The  latter  factor  also  contributes  a  considerable  propor¬ 
tion  of  the  navigator  validity  of  the  test  and  a  small  amount  of  the  lim¬ 
ited  bombardier  validity. 

Although  validities  against  air-crew  training  criteria  are  lower,  in 
general,  for  other  tests  in  this  section  than  for  Mechanical  Principles, 
sufficiently  high  validities  were  obtained  for  all  tests  to  indicate  their 
potential  usefulness  as  selection  devices  for  other  mechanical  tasks. 

Examination  of  the  factorial  composition  of  the  tests  in  this  section 
reveals  a  great  deal  of  similarity  among  them.  The  variable  most  promi¬ 
nent  in  all  these  tests  and  the  one  probably  largely  responsible  for  their 
validities  is  one  apparently  best  defined  as  mechanical  experience.  Indi¬ 
cations  are,  however,  that  differences  in  validity  among  the  tests  of  this 
group are due to variations in the total factorial picture rather than to
the  loadings  in  any  one  factor.  The  fact  that  validities  of  some  magni¬ 
tude  were  found  for  most  other  factors  appearing  in  these  tests,  namely, 
the  perceptual-speed  factor,  the  spatial-relations  factor,  and  the  visuali¬ 
zation  factor,  supports  this  view.  Determination  of  the  extent  to  which 
tests  of  such  complicated  factorial  composition  as  these  are  more  gener¬ 
ally  applicable  in  the  selection  of  individuals  for  mechanical  tasks  must 
depend  upon  future  research.  It  is  not  known  whether  all  mechanical 
jobs  stress  this  particular  combination  of  fundamental  abilities. 

MECHANICAL  INFORMATION 
Definition  and  Rationale 

In  this  group  are  included  tests  that  consist  principally  of  informa¬ 
tional  items  in  the  field  of  mechanics.  Most  of  the  material  is  presented 
verbally,  although  the  Driving  Skill  and  Physical  Principles  tests  utilize 
some  pictorial  material.  In  general,  the  correct  answers  to  items  in  these 
tests  cannot  be  determined  by  reasoning  but  require  specific  knowledge. 
This  is  one  respect  in  which  the  tests  in  this  group  differ  from  those  in 
the  mechanical-comprehension  section. 

The  objective  in  these  tests  is  to  evaluate  some  aspects  of  mechanical 
ability  by  measures  of  information.  Although  it  was  recognized  that  the 
possession  of  mechanical  information  alone  does  not  constitute  qualifi¬ 
cation,  per  se,  for  mechanical  tasks  such  as  those  of  air-crew  members, 
it appeared likely that possession of such information would be
symptomatic of the presence of certain other characteristics essential to the
successful performance of mechanical tasks. In line with this view, it was
not  considered  necessary  to  construct  the  tests  in  this  section  with  any 
particular  reference  to  air-crew  functions. 

Mechanical  Information,  CI905A 11 

The items in this test cover information about the structure, function,
dysfunction, and repair of machines. Major emphasis is placed upon
automotive information, 26 of the 30 items being related to automobiles.
Approximately one-half of the items are very brief, as illustrated by the
following sample item:

A fuel pump is driven by the:

A. Flywheel
B. Fan belt
C. Generator shaft
D. Cam shaft
E. Distributor shaft

The  other  items  are  much  longer  and  cover  descriptions  of  situations  in¬ 
volving  mechanical  problems.  The  following  item  is  illustrative  of  the 
latter  type  of  item  in  the  test : 

With pressure on the starter, the starting motor runs smoothly, but no contact is
made between the starting motor and the engine. The most probable cause of the
trouble is that:

A. The armature of the starting motor is loose.
B. The brushes in the starting motor are not making contact with the commutator.
C. The Bendix spring is broken.
D. The fuse is blown.
E. The ignition coil is not functioning properly.

(1) Administration. — The time allowed is 15 minutes. Approximately
85 percent of the group (unclassified aviation students) finishes the test
in the time allotted.

(2)  Scoring. — The  scoring  formula  is  R — W/4. 

Statistical  results. — Experimental  and  classification-battery  adminis¬ 
tration  of  this  test  yielded  numerous  statistical  data. 

(1) Distribution statistics. — Administered to more than 3,000
aviation students, the test yielded the distributions given in table 13.15.
The distribution curves are approximately symmetrical, but with greater
than normal frequencies in the extreme upper and lower reaches. It is
interesting to note that the almost identical B form yielded a mean of
12.3 and a standard deviation of 7.4 for West Point cadets, using
R - W/3 in place of the customary formula, R - W/4.

¹¹ Developed at Psychological Research Unit No. 3. Chief contributors: T/Sgt. Paul C. Davis,
Lt. Linn Hutchinson.
Table 13.15. — Distribution of scores of unclassified aviation students on
Mechanical Information, CI905A

    N          M       SD
  1,096 a     15.1     8.1
  1,015 b     14.1     8.5
  1,141 c     14.5     8.1

a Tested at Psychological Research Unit No. 1 with the December 1942 Classification battery.
b Tested at Psychological Research Unit No. 2 with the December 1942 Classification battery.
c Tested at Psychological Research Unit No. 3 with the December 1942 Classification battery.


(2) Internal consistency. — Item analysis of the test, based on the
upper 25 percent and the lower 25 percent of 360 unclassified aviation
students tested at Psychological Research Unit No. 3, revealed a high
degree of homogeneity, as indicated by a mean phi of 0.56, a standard
deviation of 0.09, and a range from 0.05 to 0.80. From these data it may
be seen that the test is one of the most homogeneous devised for air-crew
classification.

(3)  Reliability  coefficient. — Since  most  of  the  examinees  finish  within 
the  time  limit,  the  odd-even  method  of  estimating  reliability  was  em¬ 
ployed.  Based  upon  two  different  samples  of  240  cases  each,  tested  in 
April  1943  at  Psychological  Research  Unit  No.  3,  an  average  corrected 
reliability  of  0.89  was  obtained,  the  two  figures  being  0.88  and  0.90. 
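The odd-even method can be sketched as follows. This is an illustration only: it assumes items are scored 1 (right) or 0 (wrong), that the two half scores are correlated by the product-moment formula, and that the correction is the Spearman-Brown step-up for doubled length. The small response matrix is hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation of two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def odd_even_reliability(item_scores):
    """Correlate odd-item and even-item half scores, then step up to full length."""
    odd = [sum(row[0::2]) for row in item_scores]    # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]   # items 2, 4, 6, ...
    r_half = pearson(odd, even)
    return 2 * r_half / (1 + r_half)


# Hypothetical answers of six examinees to a six-item test (1 = right).
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
]
reliability = odd_even_reliability(responses)
print(round(reliability, 3))
```

The method is appropriate here because most examinees finish; on a speeded test the odd and even halves would share the effect of the time limit and the estimate would be inflated.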

(4)  Difficulty. — The  mean  difficulty  index  of  the  items,  corrected  for 
chance,  is  0.48,  with  a  range  from  0.24  to  0.71  and  a  standard  deviation 
of  0.14,  based  on  the  above-mentioned  sample  of  360  unclassified  avia¬ 
tion  students. 

(5) Factorial composition. — This test proved to be one of the purest
measures of any single factor. Since it appeared in several analyses, the
factorial composition can be considered to be quite reliably ascertained.
The only significant loading (0.74) is in the factor identified as
mechanical experience. Slight loadings in visualization (0.15) and verbal (0.11)
factors are found, but these factors contribute very little to the test.
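How little the minor factors contribute can be made concrete by squaring the loadings. The sketch assumes uncorrelated (orthogonal) factors, so that each squared loading is the share of test variance attributable to that factor; the report does not state the rotation used, so treat this as an approximation.

```python
def variance_shares(loadings):
    """Squared loading for each factor, rounded to four places."""
    return {name: round(value * value, 4) for name, value in loadings.items()}


# Loadings reported for Mechanical Information, CI905A.
ci905a_loadings = {
    "mechanical experience": 0.74,
    "visualization": 0.15,
    "verbal": 0.11,
}
shares = variance_shares(ci905a_loadings)
common_variance = round(sum(shares.values()), 4)
print(shares, common_variance)
```

On that assumption mechanical experience accounts for about 55 percent of the test variance, while visualization and verbal together account for under 4 percent.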

(6) Test validity. — Validation data are presented in tables 13.16
through 13.18.

(7) Item validity. — The mean phi was 0.047, with a standard
deviation of 0.046 and a range from -0.02 to 0.10, based upon a sample of
800 graduates and 600 eliminees from primary pilot training in classes
43K through 44C, originally tested at Psychological Research Unit No. 3.

Evaluation. — From  these  data  it  may  be  concluded  that  this  test  has 
extensive  possibilities,  although  some  of  the  evidence  is  inconclusive.  It 
appears  that  this  test  is  most  useful  in  predicting  success  in  tasks  in 
which mechanical experience is highly important. Pilot training,
air-mechanic grades, radio-operator-mechanic average grades, and
flexible-gunnery final-examination grades are in this class. Why academic
averages for armorers and flexible gunners are not more highly correlated
with  scores  on  this  test  is  not  entirely  clear.  It  seems  probable,  however, 
that  such  academic  grades  are  more  heavily  weighted  with  non-mechani- 
Table 13.17. — Validation data for Mechanical Information, CI905A, for seven
grades in navigation training ¹


Criterion 

r 

.02 

.03 

.05 

-.02 

.01 

Grades in celestial navigation (flight)

.06

.09

.00

.06

Grades in meteorology

¹ For a sample of 463 navigation trainees in Hondo classes 41-10 through 43-15, tested at
Psychological Research Units Nos. 1, 2, and 3.


Table 13.18. — Validities of Mechanical Information, CI905A, for technical
specialties

Specialty                      Criterion             N       r ¹
Air mechanic ²                 Average grades       232     0.49
Armorer ²                      Average grades       376      .19
Radio operator-mechanic ²      Average grades       153      .33
Flexible gunnery ³             Air-to-air ⁴          61      .00
Flexible gunnery ⁵             Air-to-air ⁴         194     -.10
Flexible gunnery ⁶             Air-to-air ⁴         173      .15
Flexible gunnery ³             Academic average      61      .00
Flexible gunnery ⁵             Final examination    194      .31
Flexible gunnery ⁶             Final examination    173      .36

¹ Product-moment correlations.
² Tested at Psychological Research Units Nos. 1, 2, and 3 with the December 1942 Battery.
³ In classes 43-27 to 43-30 at Tyndall. Tested at Psychological Research Unit No. 1.
⁴ A very unreliable criterion.
⁵ In class 43-45 at Ft. Myers. Tested at Psychological Research Units Nos. 1, 2, and 3.
⁶ In class 43-41 at Ft. Myers. Tested at Psychological Research Units Nos. 1, 2, and 3.

cal material than are the tasks for which higher validities were obtained.
This test is a relatively pure measure of the mechanical-experience
factor and would probably be useful in predicting success in most other
mechanical tasks.

Variations  of  the  test. — Considerable  attention  was  given  to  this  area 
with  the  result  that  several  forms  were  constructed. 

(1) Mechanical Knowledge (no code) ¹². — This is the first form of
the mechanical-information type of test. It contains 40 scored items, 27 of
which are directly related to automobile or airplane engines. The test was
administered in July 1942 at Psychological Research Unit No. 3 to 360
unclassified aviation students, and an item analysis was made. This form
proved to be homogeneous (mean internal-consistency phi = 0.47), but is
somewhat too easy. The mean difficulty index, corrected for chance, is
0.58. On the basis of the item analysis, several of the items were selected
for use in Form A of Mechanical Information, already described.

(2) Mechanical Information, CI905AX1 ¹³. — This is the original form
by this name. It contains 25 items about automobiles, including diagnosis
of trouble (20 items) and functioning of the machine and its parts (5
items). This form is moderately difficult (mean difficulty index, corrected

¹² Developed at Psychological Research Unit No. 3. Chief contributors: T/Sgt. Paul C. Davis,
Lt. Linn Hutchinson.

¹³ Same as footnote 12.
for chance, = 0.45 for an N of 350 students tested at Psychological
Research Unit No. 3). The test is quite homogeneous, yielding a mean
internal-consistency phi of 0.41 with a standard deviation of 0.12, for the
above-mentioned sample. The number of items in this form was
considered inadequate in view of the revisions expected. The item-analysis
data were helpful in preparing the revised form.

(3) Mechanical Information, CI905AX2. — This 35-item form is a
revision of Form AX1 with additional items of the same type described
under AX1. It was administered for item-analysis purposes to 400
unclassified aviation students in June 1942 at Psychological Research Unit
No. 3. The mean phi was 0.38, with a standard deviation of 0.16 and a
range from 0.09 to 0.80. The mean difficulty of the items, corrected for
chance, is 0.44, and the standard deviation of difficulty values is 0.20.

Fifteen  of  the  items  from  this  test  that  yielded  high  internal-consis¬ 
tency  phis  and  satisfactory  difficulty  were  selected  for  use  in  Mechanical 
Information,  CI905A. 

(4) Mechanical Information, CI905BX. — This is a two-choice form
of CI905A, prepared for the purpose of studying item reliability. Based
upon the item analysis of CI905A, the correct answer and the mislead with
the highest discriminating value were selected for use in this form. Item
analysis based upon experimental administration of the test at
Psychological Research Unit No. 3 in November and December 1943 to 800
unclassified aviation students yielded a mean internal-consistency phi of 0.47
and a standard deviation of 0.08.

Although the mean phi value for this form is lower than for Form
A, these data show that relatively high internal consistency (hence
reliability) can be achieved even with two-alternative items, if misleads are
strong discriminators between good and poor groups as determined by
total score on the test.

(5) Mechanical Information, CI905B. — This is a second classification
form of Mechanical Information and differs from CI905A only slightly.
Some alternatives were revised, omitted, or rearranged, but otherwise
little change was made in preparing this form.

Driving  Skill,  CI307AX1  ** 

Because  of  the  similarity,  superficial,  at  least,  of  the  tasks  involved  in 
driving  an  automobile  or  truck  and  in  flying  an  airplane,  it  appeared 
logical to expect that a measure of success in the former would
constitute a good predictor of success in the latter. It could be assumed that
experience  with,  and  hence  knowledge  of,  the  operation  of  automobiles 
might  be  indicative  of  interest  in  and  certain  aptitude  for  mechanical 
tasks. 


** Developed at Psychological Research Unit No. 3. Chief contributors: T/Sgt. Paul C.
Davis, Major James J. Gibson, Capt. L. G. Humphreys, Lt. Linn Hutchinson. An earlier
form, called "Automobile Driving Test," was developed by Major Neal E. Miller.
It was originally reasoned that a test of driving skill might measure
a type of judgment that is important in flying. On the basis of this
assumption, this test was given a Judgment code number. As will be seen
later in the discussion, this assumption did not prove to be correct.

Description. — The test consists of 42 scored items, 31 presented
verbally, and 11 presented by means of pictures, representing situations in
which  the  examinee  is  required  to  indicate  the  best  decision  or  driving 
practice.  The  following  item  and  figure  13.6  are  typical  of  the  two  kinds 
of  items. 
If one front tire is softer than the other, the car will tend to:

A. Pull to the side of the soft tire.
B. Skid in the direction of the soft tire.
C. Pull away from the side of the soft tire.
D. Skid in the opposite direction from the soft tire.
E. Pull from one side of the road to the other.

(1)  Internal  characteristics. — The  test  is  separated  into  two  parts  of 
21  items  each.  Each  part  is  timed  separately  (15  minutes)  in  order  to 
provide  a  basis  for  estimating  reliability. 

(2)  Administration. — Because  of  the  unusual  problems  in  this  test, 
it  was  necessary  to  include  a  special  paragraph  in  the  directions.  This 
paragraph  warns  against  answering  according  to  legal  rules  and  urges 
the  examinee  to  answer  in  line  with  best  driving  practice  without  reference 
to  traffic  regulations.  Standard  directions  for  answering  and  marking 
answer sheets are also included. The time allowed permits approximately
80 percent of the students to finish the test. This is quite satisfactory,
inasmuch as the test is designed as a power test.

(3)  Scoring. — Most  of  the  items  contain  five  alternative  responses, 
but  a  few  have  only  three  or  four.  The  test  is  scored  R— W/4. 

Statistical  results. — This  test  appeared  in  a  battery  of  tests  which 
underwent  rather  thorough  statistical  analysis.  Except  where  noted  to 
the  contrary  the  data  are  based  upon  examinees  tested  in  December  1942 
at  Psychological  Research  Unit  No.  3. 

(1)  Distribution  statistics. — A  sample  of  202  unclassified  aviation 
students  yielded  a  mean  score  of  18.6  and  a  standard  deviation  of  5.4. 
The  distribution  of  scores  is  symmetrical,  but  slightly  flatter  than  normal. 

(2)  Internal  consistency. — An  internal-consistency  item  analysis  of 
the  test,  based  on  the  upper  25  percent  and  the  lower  25  percent  of  202 
unclassified  aviation  students,  yielded  a  mean  phi  of  0.28,  a  standard 
deviation  of  0.11,  and  a  range  from  0.08  to  0.58.  Twelve  items  have  phi 
values  below  0.20.  Examination  of  these  items  reveals  two  probable  rea¬ 
sons  for  their  lack  of  internal  consistency.  In  some  instances  certain 
alternatives  are  general  and  cover  other  more  specific  alternatives  also, 
leaving  the  best  (not  to  say  correct)  answer  questionable.  In  many  cases, 
also,  the  correct  answer  is  poorly  presented  or  answering  is  dependent 
upon  correct  interpretation  of  the  situation.  Almost  half  (five)  of  the 
pictorial  items  are  in  the  low-phi  group. 

(3) Reliability coefficient. — By the alternate-forms method (pt. I-pt.
II), the reliability is 0.55, corrected for length, based upon a sample of
239 unclassified aviation students.

(4) Difficulty. — The difficulty level of the items, based on the item
analysis previously referred to, is indicated by a mean proportion of
correct responses, corrected for chance, of 0.47, a standard deviation of
0.21, and a range from 0.00 to 0.85.

(5) Factorial composition. — The loading in the mechanical-experience
factor was found to be 0.46 and that in visualization 0.42. A small
loading (0.15) also is found in spatial relations. The test appears to
measure about the same things as the mechanical tests (Mechanical
Principles, CI903A, and Mechanical Movements, CI904A), but much
less reliably. Its communality of 0.53 almost completely exhausts its
nonerror variance.

(6) Test validity. — Based upon a relatively small sample of 149
pilots in primary training in class 43J, the test yielded a biserial validity
of 0.12. The mean of the graduates was 20.34, that of the eliminees
19.34, and the standard deviation of the total, 4.92. Of this sample, 75
percent were graduates.

(7) Item validity. — A study of item validity for pilots, based upon
200 graduates and 45 eliminees from primary training, showed validity
phis ranging from -0.13 to 0.28 with a mean of 0.03 and a standard
deviation of 0.09. This sample was tested in November 1943.

Evaluation. — This test was included in the first foresight-and-planning
battery, being considered at that time to be a measure of planning
and  judgment.  Factor  analysis  of  the  intercorrelations  revealed  that, 
contrary  to  the  original  hypothesis,  the  driving  skill  test  is  most  heavily 
loaded  in  the  mechanical-experience  and  visualization  factors  rather  than 
in  planning. 

From its factors and their loadings one would expect a pilot validity
of 0.30 for this test. The obtained composite validity of 0.32 (see table
28.18) is very close to this expectation.

Its  factorial  content  indicates  that  any  use  to  which  it  might  be  adapted 
could  be  better  performed  by  other  mechanical  tests.  Certain  items  that 
correlate  highly  with  a  mechanical-information  test  score  and  low  with 
a  visualization  test  score,  might  well  be  incorporated  in  a  test  of  the 
mechanical-experience  factor.  Others  that  prove  to  be  valid  for  pilots 
might  be  incorporated  in  a  heterogeneous  general-information  test.  In 
view  of  the  apparent  duplication  in  this  test  of  the  functions  measured 
by  other  mechanical  tests,  it  was  not  deemed  profitable  to  develop  this 
test,  as  such,  further  for  pilot  selection. 

Physical Principles, CI801BX

The dependence of all mechanical phenomena upon basic laws of
physics was viewed as being sufficient justification for a test of technical
physics, at least experimentally. It was reasoned that if mechanical
ability should prove valid for the prediction of air-crew success, knowledge
of  the  basic  principles  upon  which  mechanics  depend  might  also  be 
valid.  It  also  appeared  that  knowledge  of  the  correlations  of  mechanical 
tests  with  a  test  of  formal  physics  would  be  valuable  in  analyzing  the 
results  and  evaluating  the  usefulness  of  such  tests.  One  fact  tending  to 

“  M  hnk»l(|Kil  Rwefli  Vm'l  K*  S.  Cliff  contributor!:  U  Join  A.  Bilk. 
indicate that a test of formal physics might not be valid, at least for pilot
selection,  was  the  approximately  zero  correlation  of  academic  intelligence 
(verbal  ability)  with  pilot  success  in  primary  training.  All  in  all,  these 
considerations  suggested  that  a  measure  of  knowledge  of  academic  phy¬ 
sics  should  be  constructed  and  the  results  of  its  application  thoroughly 
examined. 

Description. — This test contains 30 items, 28 of which are presented
verbally and 2 diagrammatically. The items are based predominantly
upon  principles  and  laws  of  physics  which  would  be  fairly  familiar  to  a 
student  who  has  just  completed  a  course  in  high-school  physics.  The 
electricity  items  are  generally  at  a  relatively  simple  level.  Previous  forms 
of  the  test  had  provided  indications  that  electricity  items  of  greater  dif¬ 
ficulty  were  so  hard  as  to  be  practically  useless  in  discriminating  be¬ 
tween  good  and  poor  groups,  since  few  were  able  to  answer  them. 

(1)  Administration. — The  time  allowed  is  18  minutes. 

(2)  Scoring. — The  test  is  scored  R—W/4. 

Statistical results. — Fuller statistical results are available for this than
for  other  forms  of  the  test.  The  data  are  for  examinees  tested  at  Psy¬ 
chological  Research  Unit  No.  3. 

(1)  Distribution  statistics. — Distribution  data  are  given  in  table  13.19. 


Table 13.19. — Distribution data for Physical Principles, CI801BX

Group                                   N        M            SD
Unclassified aviation students ¹       368     11.[illegible]  [illegible]
 ²                                    5,408    18.2           7.1

¹ Tested in October and November 1943.
² In classes 44B and 44C. Tested in [month illegible] 1943.


(2) Internal consistency. — As is true for most of the tests in this area, the items of this test proved to be highly cohesive. The mean internal-consistency phi based upon the upper 25 percent and lower 25 percent of 800 unclassified aviation students tested in November 1943 is 0.51, and the standard deviation is 0.09. The range of values is relatively small (0.32 to 0.66).
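
As a hedged illustration of an internal-consistency phi, the sketch below computes the phi coefficient for one item from a fourfold table of upper-group and lower-group passes and failures. The counts are hypothetical (200 examinees in each group, as the upper and lower 25 percent of 800 would give); the report does not reproduce item-level counts or its exact computing procedure.

```python
from math import sqrt


def phi_coefficient(upper_pass: int, upper_fail: int,
                    lower_pass: int, lower_fail: int) -> float:
    """Phi for a 2 x 2 table: (ad - bc) / sqrt of the product of the four marginals."""
    a, b, c, d = upper_pass, upper_fail, lower_pass, lower_fail
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("a marginal total is zero; phi is undefined")
    return (a * d - b * c) / denominator


# Hypothetical item: 160 of the upper 200 pass, 60 of the lower 200 pass.
item_phi = phi_coefficient(160, 40, 60, 140)
print(round(item_phi, 2))
```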

(3) Reliability coefficient. — In view of the fact that the test is essentially a power test, the odd-even method of estimating reliability was employed. Based on the above-mentioned sample of 368 (see table 13.19), the estimated reliability, corrected for length, is 0.86.
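
The odd-even procedure can be sketched as follows, assuming that "corrected for length" refers to the Spearman-Brown step-up of the correlation between odd-item and even-item half scores (the report does not name the formula). The function names and the small response matrix are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half: float, length_factor: float = 2.0) -> float:
    """Step a part-test correlation up to the reliability of the lengthened test."""
    return length_factor * r_half / (1 + (length_factor - 1) * r_half)


def odd_even_reliability(item_matrix):
    """item_matrix: one row per examinee, one 0/1 entry per item, in test order."""
    odd_scores = [sum(row[0::2]) for row in item_matrix]   # items 1, 3, 5, ...
    even_scores = [sum(row[1::2]) for row in item_matrix]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd_scores, even_scores))


# Hypothetical four-examinee, four-item matrix, for illustration only.
toy_matrix = [[1, 1, 1, 1], [1, 1, 0, 0], [0, 0, 0, 0], [1, 0, 1, 0]]
toy_reliability = odd_even_reliability(toy_matrix)
```

Read in reverse, a corrected reliability of 0.86 corresponds to a half-test correlation of about 0.75 under this formula.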

(4) Difficulty. — The mean difficulty index of the items, based on the sample referred to previously (N = 368), is 0.44, corrected for chance. The standard deviation of the distribution of corrected difficulties is 0.15. The range of corrected difficulties (0.14 to 0.72) indicates adequate variety of difficulty for the group being tested.
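
One common way of correcting an item difficulty for chance is sketched below. This is an assumption: the report does not give its formula, and the counts are hypothetical. The sketch takes the corrected proportion passing as (R - W/(k - 1))/N, where R and W are the numbers answering the item rightly and wrongly, k is the number of alternatives, and N is the number of examinees.

```python
def corrected_difficulty(n_right: int, n_wrong: int, n_examinees: int,
                         n_alternatives: int = 5) -> float:
    """Proportion passing an item after removing the expected share of lucky guesses."""
    if n_alternatives < 2 or n_examinees <= 0:
        raise ValueError("need at least two alternatives and one examinee")
    return (n_right - n_wrong / (n_alternatives - 1)) / n_examinees


# Hypothetical item in a sample of 368: 200 right, 120 wrong, 48 omitted.
p_corrected = corrected_difficulty(200, 120, 368)
print(round(p_corrected, 2))
```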
(5) Factorial composition. — The highest loadings of the test are in the mechanical-experience (0.51) and verbal (0.38) factors. No others exceeded 0.20. The communality of 0.46 is far below the test reliability.
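
The relation between loadings, communality, and reliability can be checked with a short sketch. It assumes orthogonal factors, so that the communality is the sum of squared loadings; only the two loadings, the communality, and the reliability quoted in the text are used, and the variable names are illustrative.

```python
def communality(loadings):
    """Sum of squared loadings on orthogonal common factors."""
    return sum(a * a for a in loadings)


reported_loadings = [0.51, 0.38]   # mechanical experience, verbal
reported_communality = 0.46
reported_reliability = 0.86

# Share of the communality carried by the two largest loadings.
partial_communality = communality(reported_loadings)

# Remainder attributable to the loadings that did not exceed 0.20.
remaining_common = reported_communality - partial_communality

# Reliable variance not accounted for by the common factors of this analysis.
reliable_but_not_common = reported_reliability - reported_communality
```

On these figures about 0.40 of the variance comes from the two named factors and a further 0.40 is reliable but not shared with the other tests analyzed.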

(6) Test validity. — This test was validated against the primary graduation-elimination criterion in a sample of 5,408 pilots in classes 44B and 44C, of whom 53 percent graduated. Graduates had a mean score of 18.96, and eliminees a mean of 17.40. The standard deviation of the total was 7.08, and the biserial correlation was 0.15, corrected to an assumed unrestricted stanine standard deviation of 1.81.
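
The figures above can be put through the textbook biserial formula as a rough check. The sketch assumes r_bis = ((M_g - M_e)/SD_t)(pq/y), where y is the ordinate of the unit normal curve at the point dividing it into proportions p and q. It gives the uncorrected coefficient only; the correction for restriction of range is not reproduced because the restricted standard deviation is not stated here.

```python
from statistics import NormalDist


def biserial_r(mean_grad: float, mean_elim: float,
               sd_total: float, p_grad: float) -> float:
    """Biserial correlation between a continuous score and a pass-fail criterion."""
    q_elim = 1.0 - p_grad
    unit_normal = NormalDist()
    z_cut = unit_normal.inv_cdf(p_grad)   # point of dichotomy
    ordinate = unit_normal.pdf(z_cut)     # height of the normal curve at that point
    return (mean_grad - mean_elim) / sd_total * (p_grad * q_elim) / ordinate


# Values quoted in the text for classes 44B and 44C.
uncorrected_r = biserial_r(18.96, 17.40, 7.08, 0.53)
print(round(uncorrected_r, 3))
```

The result, a little under 0.14, is consistent with the reported value of 0.15 after correction.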

Evaluation. — This  test  measures  mechanical  experience  but  less  purely 
than  several  other  tests,  particularly  the  mechanical-information  tests.  Its 
obtained  validity  (see  Table  78.18)  fell  slightly  short  of  its  expected 
validity  of  0.15  for  pilot  selection. 

A  test  such  as  this  might  prove  very  useful  in  selecting  and  classify¬ 
ing  for  tasks  in  which  both  academic  intelligence  and  mechanical  knowl¬ 
edge  play  an  important  part.  The  homogeneity  and  the  substantial  verbal 
loading  suggest  its  possible  use  in  predicting  success  in  technical  or  en¬ 
gineering  studies. 

Variations  of  the  test. — Several  other  forms  of  physics  tests  were 
prepared. 

(1) Physics, CI801A. — This is a preliminary 30-item form that was constructed for administration in August 1942 at Psychological Research Unit No. 3 with the experimental mechanical battery. The test consists of verbally presented technical physics items and was administered to about 250 unclassified aviation students. It proved to be too hard, the mean difficulty index, corrected for chance, being approximately 0.34, and the standard deviation being 0.19. Factorially, this form bears close similarity to Reading Comprehension, CI614H, described in chapter .5, its loadings in the verbal (0.68), visualization (0.25), and reasoning (0.17) factors differing only slightly from those of Reading Comprehension in the same factors (0.60, 0.30, and 0.19 respectively). The loading in mechanical experience is only 0.21, as compared with 0.51 in the B form.

(2) Physical Principles, CI801AX. — This form contains a total of 102 items and constitutes an attempt to bring together a large number of all types of physics problems, from which a shorter form could be prepared. This form proved to be of approximately the same difficulty as the 30-item preliminary form, the mean difficulty, corrected for chance, being 0.35 and the standard deviation being 0.19. This form is quite homogeneous, yielding a mean internal-consistency phi of 0.43 and a standard deviation of 0.13, based upon 360 cases tested in August 1942 at Psychological Research Unit No. 3. Selection of items for the CI801BX form was made

*'  Dtrrioprd  .1  Ptychotovcsl  Rciorck  Unit  .<«.  J.  CV*f  contributor*  :U.  UwU  G.  C*f- 
on the basis of (1) suitable difficulty, (2) availability of functional misleads, and (3) appropriate internal-consistency phi values.

Summary and Evaluation of Mechanical Information Tests

The  tests  included  in  this  group  have  one  outstanding  characteristic  in 
common;  they  call  for  knowledge  of  specific  facts.  The  specificity  of 
these  facts  resulted  in  quite  diverse  factorial  content  among  the  various 
tests.  The  outstanding  example  is  the  relatively  high  loading  in  the 
mechanical-experience  factor  achieved  by  the  Mechanical  Information, 
and  Driving  Skill  tests,  while  the  Physics  test,  on  the  whole,  was  most 
heavily  loaded  in  the  verbal  factor.  The  relatively  high  pilot  validity  of 
the  mechanical-experience  factor  made  the  tests  high  in  this  factor  most 
useful  in  pilot  selection.  The  homogeneity  of  most  of  the  tests  in  this 
group  suggests  that  each  of  the  tests  might  be  used  to  good  advantage  in 
situations  where  specificity  of  function,  such  as  that  involved  in  the  test, 
is  required.  These  tests  would  probably  be  less  useful  for  general  pre¬ 
dictive  purposes  than  those  included  in  the  mechanical-comprehension 
section. 


A FACTOR ANALYSIS OF MECHANICAL TESTS

Although mechanical tests have been in use for some time in civilian life, as indicated in the introduction to this chapter, little was known about their unique components or factors. After the preparation of a battery of mechanical tests, it appeared desirable to factor analyze intercorrelations among these and certain other tests. These other tests were added in order to obtain a comprehensive picture of the mechanical tests by bringing out as many of the factors as possible.

The  Data 

This analysis included 17 tests which covered many phases of human ability. Of these tests, seven were designed as strictly mechanical tests, three involved length estimation (Nearest Point, Shortest Path, Shorter Line), and three were designed along with mechanical tests but were later found to be quite different (Physics, Pattern Assembly, Pattern Comprehension). The remaining four (Reading Comprehension, Arithmetic Reasoning, Spatial Orientation, Complex Coordination) represented different areas covered by the classification tests. All these tests, with the exception of Complex Coordination (an apparatus test) and Mechanical Comprehension, AC10B and AC10D, are described elsewhere in this volume. The last-named tests are, superficially at least, quite similar to the Mechanical Functions test. They were parts of the two AAF qualifying examinations with the code numbers given (see report No. 6).

The list of tests and intercorrelations appears in table 13.20. The correlations are based on a sample of 153 unclassified aviation students. Table 13.21 presents centroid loadings and rotated factor loadings.
The Factors

In this section the factors are described, and the tests appearing on each are listed in order of decreasing saturation. Only rotated factor loadings of 0.25 and above are given.
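
The listing convention just described (loadings of 0.25 and above, in decreasing order) can be sketched in a few lines. The loadings are those reported for rotated factor VII in this section; the entry "Hypothetical Test X" is invented solely to show a loading that falls below the cutoff, and the function name is illustrative.

```python
def salient_tests(loadings: dict, threshold: float = 0.25):
    """Return (test, loading) pairs at or above the threshold, largest first."""
    kept = [(name, value) for name, value in loadings.items() if value >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


factor_vii = {
    "Pattern Comprehension": 0.45,
    "Mechanical Movements": 0.25,
    "Arithmetic Reasoning": 0.56,
    "Shortest Path": 0.26,
    "Nearest Point": 0.47,
    "Mechanical Comprehension, AC10B": 0.32,
    "Hypothetical Test X": 0.10,   # invented entry, not from the report
}

listing = salient_tests(factor_vii)
for name, value in listing:
    print(f"{name:35s} {value:.2f}")
```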

Rotated factor I is defined by the following data:

Test No.      Test name                            Loading
[illegible]   Mechanical Information               0.7[?]
4             Tool Function                        .77
1             Mechanical Principles                .54
10            Mechanical Comprehension, AC10B      .4[?]
11            Mechanical Comprehension, AC10D      .42
12            Mechanical Functions                 .42
2             Mechanical Movements                 .38

This  is  the  mechanical-experience  factor.  At  one  time  it  was  desig¬ 
nated  mechanical  information,  but  this  and  other  analyses  tend  to  indi¬ 
cate  that  experience  or  background  are  better  terms,  since  many  tests 
appearing  on  the  factor,  such  as  Mechanical  Principles  and  Mechanical 
Movements,  seem  to  depend  more  upon  general  mechanical  experience 
than  upon  specific  knowledge  of  things  mechanical.  Not  a  single  mechan¬ 
ical  test  in  the  battery  failed  to  have  a  substantial  loading  on  this  factor. 
On  the  other  hand,  tests  like  Pattern  Comprehension  and  Pattern  Assem¬ 
bly,  traditionally  believed  to  be  in  the  mechanical  area,  are  conspicuous 
by  their  absence.  If  they  are  valid  for  predicting  success  in  mechanical 
tasks,  it  is  because  of  some  other  factor  than  this  one.  It  is  believed  that 
those  other  factors  have  been  identified  in  this  and  other  analyses  re¬ 
ported  in  this  volume. 

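Rotated factor II is defined by the following data:

Test No.      Test name                            Loading
[illegible]   [illegible]                          0.62
2             [illegible]                          [illegible]
5             [illegible]                          [illegible]
[illegible]   [illegible]                          .31
4             [illegible]                          .30
[illegible]   [illegible]                          .29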

This is the perceptual-speed factor defined in other analyses by Speed of Identification as well as Spatial Orientation. Two of the mechanical tests, which involve following operations of parts, have moderate loadings in this factor, but neither has a high loading. Pattern Assembly (a paper form board) and Pattern Comprehension (surface development) also appear on this factor, although the loadings indicate that they are not primarily perceptual tests. The absence of Mechanical Information from this list was to be expected. The presence of the Tool Function test must mean that the diagrams in that test were too detailed, or the element of speed was somehow stressed too much, or both features share the blame for the loading on perceptual speed.** Since this finding has not been verified in a second analysis, however, not too much concern should be felt about explaining it or about ridding the test of it.

Rotated factor III is defined by the following data:

Test No.      Test name                            Loading
9             Reading Comprehension                0.71
5             Physics                              .68
11            Mechanical Comprehension, AC10D      [illegible]
13            [illegible]                          .25
12            Arithmetic Reasoning                 .25

This is the verbal factor, which has been well defined in several analyses discussed elsewhere in this volume. The loading of 0.68 for Physics in this factor and its loading of 0.21 in mechanical experience emphasize the great factorial difference between this test and the mechanical tests in general. The two mechanical-comprehension tests appearing on this factor display more characteristics of verbal tests than do the tests described in this chapter. The absence of the Mechanical Information test from the list is most eloquent of the possibility of exclusion of undesired factors from tests. Its items are entirely verbally presented, and yet the level of verbal comprehension is apparently so low that individual differences in the trait do not influence scores in the test. The test of Driving Skill that, in another analysis, had a high mechanical variance but zero verbal variance is another good example.

Rotated factor IV is defined by the following data:

Test No.      Test name                            Loading
[illegible]   [illegible]                          0.61
7             Pattern Assembly                     .52
16            [illegible]                          .50
15            Shortest Path                        .46

This is a length-estimation factor, which is quite well defined by the three quantitative-perception estimation tests (see ch. 18). Conscious effort had been made to introduce length estimation into the Pattern Assembly test, and its loading of 0.52 on this factor indicates that solution of the problems depends to a considerable extent on this factor. The factor should possibly be defined as a more general "size-estimation" ability, because it must be recognized that the tests listed represent different kinds of space judgments: length of lines (straight lines or irregularly curved lines), gaps between points, and gaps between lines. Thus, it would seem that not only one-dimensional but also two-dimensional extents are discriminated in the tests loaded with this factor. It is possible that the more complex discriminations rest upon combinations or abstractions of linear judgment, however. Since the test with the greatest loading is Line Length, also, we are probably justified in choosing the factor title given: length estimation.

** As a matter of fact, the [illegible] form of this test was discontinued, [illegible] reproductions were poor where details were important.

Rotated factor V is defined by the following data:

Test No.      Test name                            Loading
2             Mechanical Movements                 0.51
1             Mechanical Principles                .40
11            Mechanical Comprehension, AC10D      .40
15            Shortest Path                        [illegible]
6             Pattern Comprehension                .28
10            Mechanical Comprehension, AC10B      .27
[illegible]   [illegible]                          .25

This is the visualization factor, which also appeared in several other analyses. It seems to be a very common secondary factor in mechanical tests, particularly in those that involve moving mechanisms. This gives one small clue as to the nature of the factor, which has been tentatively defined as a manipulatory visualization. Objects are imagined as moving or as having been moved or transformed in tests loaded with this factor.

Rotated factor VI is defined by the following data:

Test No.      Test name                            Loading
14            SAM Complex Coordination             0.52
10            Mechanical Comprehension, AC10B      .36
1             Mechanical Principles                .29
2             Mechanical Movements                 .28
14            Shortest Path                        .27

This is the spatial-relations factor, long defined by the Complex Coordination test. This factor appears generally in pictorial mechanical tests but not in verbal tests, as might be expected. The spatial arrangement of mechanical devices seems to be the significant element bringing this about.

Rotated factor VII is defined by the following data:

Test No.      Test name                            Loading
12            Arithmetic Reasoning                 0.56
15            Nearest Point                        .47
6             Pattern Comprehension                .45
10            Mechanical Comprehension, AC10B      .32
16            Shortest Path                        .26
2             Mechanical Movements                 .25

This appears to be the general-reasoning factor isolated in other analyses. The loadings of Arithmetic Reasoning, Pattern Comprehension, and Mechanical Comprehension, particularly, clearly agree with this naming of the factor. The loading of 0.47 for Nearest Point is very difficult to rationalize. The magnitude of the loading may be a sampling artifact, however, since the correlation of the test with Arithmetic Reasoning for a larger sample (N = 392) was only 0.10 as compared with 0.36 in this matrix. The heterogeneous list of tests is not unusual for this factor.

General reasoning, the most constant component of arithmetic-reasoning tests, shows up in many places. The ability is apparently a kind of "all-purpose" or "trouble-shooting" trait, which is called into the picture when more immediate comprehension is lacking.

Conclusions 

The  principal  contribution  of  this  analysis  consists  in  the  better  defi¬ 
nition  of  mechanical  tests.  Important  also  is  the  additional  information 
concerning  factors  important  in  these  tests  in  common  with  others. 

Taking a quick survey of the results, we see that there is an important single factor common to printed mechanical tests and also unique in them. It seems clearly to represent a variance in previous mechanical experiences as reflected most clearly in information tests. A strong secondary factor in pictorial mechanical-comprehension tests is visualization. Other factors with substantial loadings are spatial relations and perceptual speed. Deviations from the general pattern are the tests Mechanical Movements (strongest in visualization), Physics (strongest in the verbal factor), Pattern Assembly (strongest in length estimation), and Pattern Comprehension (strongest in general reasoning, though it probably need not be).

All the factors in the list for the mechanical-battery analysis are valid for pilot selection except for the verbal and general-reasoning factors. When any test combines valid factors, its scores are bound to yield exceptionally good predictions. If one desired the most univocal representative of mechanical experience, however, he would choose one of two tests: Mechanical Information and Tool Function. The other valid factors in mechanical tests can be better assessed by means of nonmechanical tests. The unique contribution of tests in this area, therefore, is the mechanical-experience factor. A factor discovered in the analysis of the mechanical battery is that of length estimation, which will be met again in another chapter.

BIBLIOGRAPHY 

(1) Bennett, G. K., and Cruikshank, R. M. A Summary of Manual and Mechanical Ability Tests (Preliminary Form). New York: The Psychological Corporation, 1942.

(2) Cox, J. W. Mechanical Aptitude: Its Existence, Nature, and Measurement. London, England: Methuen, 1928.

(3) Harrell, T. W. A Factor Analysis of Mechanical Ability Tests. Psychometrika, 1940, 5, 17-33.

(4) Stenquist, J. L. Measurement of Mechanical Ability. Teachers College Contributions to Education, 1923, No. 130.

(5) Thurstone, L. L. Primary Mental Abilities. Psychometric Monograph No. 1. Chicago: University of Chicago Press, 1938.




CHAPTER FOURTEEN

Information  Tests1 


RATIONALE  OF  INFORMATION  TESTS 

Why  Information  Tests  Are  Important 

Information tests are not by any means a new type of test. Their use as direct measures of achievement is a long-standing tradition. Only in recent years, however, has their value for indirect measurement of human traits been demonstrated. It is becoming recognized more and more that what a person knows or does not know can be used to reveal a number of things concerning his personal background. Since he is to a large extent a product of his personal experience, and since what he is bodes good or ill concerning his future status in one respect or another, knowledge scores promise to have predictive value. It was therefore decided to exploit information tests as potential predictors of the success of aviation students.

Knowledge as an indication of motivation. — Job descriptions of air-crew positions and studies of students in training indicated that interest or high motivation is important for success. It was hypothesized that the possession of information about a certain area usually accompanies interest in that area, and so information tests were constructed as measures of interest. The quantitative use of self-rating measures of interest, such as are given by the Training Interests Blank, CE501E, in which the student rates his degree of interest in each type of air-crew training, was considered undesirable because of the strong subjective element. On the other hand, it was thought that some graduated evaluation of a candidate's strength of interest should enter into the composite score used in qualifying or disqualifying him for training. Information tests were accordingly intended to provide more objective and reliable measures of interest.

Knowledge as an indication of skills. — An aviation student who has psychomotor aptitudes and skills that are useful in flying will be more likely to succeed in training than a student who lacks such abilities. Information tests were therefore constructed to measure the extent to which candidates have had experiences that develop motor skills pertinent to the air-crew jobs. Trade tests first developed by the United States Army and subsequently by the United States Employment Service use information tests to measure levels of worker skill in each of numerous vocations. From a study of biographical background data it had been found that participation in certain sports and hobbies was prognostic of air-crew success. The measurement of these experiences more objectively than in terms of personal statements called for something like information tests.

1 Written by Staff Sgt. F[illegible].

Knowledge  that  may  transfer  to  flying. — A  few  information  tests 
were  designed  purely  as  measures  of  knowledge.  The  aviation  student 
who  possesses  information  directly  pertinent  to  a  specific  job  or  who  has 
relevant  background  information  so  that  he  can  more  easily  acquire  such 
knowledge  will  be  at  an  advantage  in  air-crew  training. 

Content of Tests to Be Considered

To  recapitulate,  this  chapter  deals  with  tests  that  use  informational 
items  to  measure  interest,  previously  acquired  skills,  experiences  perti¬ 
nent  to  air-crew  jobs,  and  specific  knowledge.  These  measures  of  interest 
or  of  pertinent  experience  or  knowledge  fall  into  two  categories:  (1) 
sports-and-hobbies  tests  and  (2)  general-information  tests.  Tests  of 
mechanical  information  are  treated  in  the  chapter  on  mechanical  tests 
(ch.  13).  Another  special  group  of  information  tests  is  described  in  con¬ 
nection  with  the  assessment  of  masculine  vs.  feminine  attitudes  in  chap¬ 
ter  25. 


SPORTS  AND  HOBBIES  PARTICIPATION  TESTS 
Earlier  Use  In  Aviation 

During the first World War, the British gave weight to participation in various sports and hobbies in their selective process for military personnel. The United States Army did exploratory work in this area in 1917-1918. In this war, results for the Navy Biographical Inventory suggested a satisfactory predictive value of biographical sports-and-hobbies questions.

Plan of Research

A  tentative  list  of  areas  to  be  included  in  the  sports-and-hobbies  tests 
was  constructed,  utilizing  four  sources  of  information: 

(1) The Navy blank, a study of which had revealed that certain avocational activities were significant in pilot selection.

(2) Job analyses of the duties of bombardiers, navigators, and pilots, which indicated that participation in certain sports and hobbies should have differential predictive value.

(3) Hypotheses based upon a priori psychological considerations as to which sports and hobbies should be significant.

(4) Information yielded by the results of the administration to 158 aviation students of the Sports and Hobbies Check List, CE506X, of extent of participation in 50 sports and hobbies.*

Method of development. — From all sources, information concerning the sports and hobbies to be included was synthesized. Activities were included only if there was either evidence or a sound rationale for their significance and if they were of the kind normally participated in by many air-crew candidates.

All questions developed for sports-and-hobbies tests were of the five-alternative information type. The first four alternatives always included the correct answer and distractors. The fifth answer was always "Don't know." This was included to give the examinee an acceptable way of expressing ignorance of the correct answer and, consequently, avoiding to some extent the chance element involved in guessing.


Sports  and  Hobbies  Participation  Test,  CE506D  and  CE506E* 

These are two equivalent forms of the test. Each form includes 100 items matched with those of the other form for statistical similarity and content.


These questions, with many more, had been given in a preliminary form, CE506C, to examinees who entered pilot classes 44F and 44G for item analysis and item validation, and were selected for inclusion in these equivalent forms on the basis of the following considerations:

(1) The tetrachoric correlation between passing or failing the item and being in the highest 27 percent or lowest 27 percent of the total-score (for a given activity) distribution. The minimum acceptable value of r was +0.60. (There were a few exceptions to this general rule.)

(2) The percentage of responses to the most popular misleads by the upper and lower groups. If not over 20 percent of those in the upper 27 percent of the total-score distribution selected an incorrect answer, the item was included.
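
The two selection rules can be sketched as a simple filter. The item identifiers and statistics below are hypothetical, as are the class and function names; the thresholds (a tetrachoric r of at least +0.60 and not over 20 percent of the upper group choosing an incorrect answer) are those stated above, and the few exceptions mentioned in the text are not modeled.

```python
from dataclasses import dataclass


@dataclass
class ItemStats:
    item_id: str
    r_tet: float             # tetrachoric r of pass-fail with upper vs. lower 27 percent
    upper_wrong_pct: float   # percent of the upper 27 percent choosing an incorrect answer


def select_items(items, min_r: float = 0.60, max_upper_wrong_pct: float = 20.0):
    """Keep items that satisfy both the discrimination rule and the mislead rule."""
    return [item.item_id for item in items
            if item.r_tet >= min_r and item.upper_wrong_pct <= max_upper_wrong_pct]


# Hypothetical item statistics, for illustration only.
candidates = [
    ItemStats("pool-01", 0.72, 12.0),   # passes both rules
    ItemStats("pool-02", 0.55, 8.0),    # fails the minimum r
    ItemStats("pool-03", 0.81, 27.0),   # too many upper-group errors
]
selected = select_items(candidates)
print(selected)
```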

Table  14.1  gives  the  number  of  items  for  each  activity  included  in 
both  CE506D  and  CE506E. 

Table 14.1. — Distribution of questions included in each form of the Sports and
Hobbies Participation Test (CE506D and CE506E)


Group 1 (5 items each):
Automobile Driving.
Flying.
Model Plane Building.
Photography.
Radio.
[illegible]


Group 2 (4 items each):
Basketball.
Tennis.
Firearms.
Hunting.
Motorcycling.

•TW  SO  Ktlnliu  af«  (Mttll,  ba.krlkiR.  Vi«UD,  Itowt.  r*M.  Wwtlof,  SlSIrtf, 

tailm*.  Of.,*.  k«rwln(k  r»l>nc.  oHv  fcunlin*.  owl*!  flin*  ktuklw.  plipn 
■KlUt,  MlwcTdint,  itrM  iliMtint,  Mm'iilMi,  Jintim,  akrlrkin^  IM  r'  *"*•. 

(rnMiitf  inj  dr<rlof»n*),  nlWtiAi  muk,  Vann*,  (ni  pont,  a;  *W*I  r*cv»|.  >hw(. 

Mill  rkturr  (kMtfiipkr  (prmliitp  tod  klD’tViH.  rkrx,  cmpi..  ••Unl'M  W*** 

frrwJv  drimiiui.  frltin*.  kikinp,  Irark  mirttmimkir.  pool.  ra«lio,  k*  kottf.  Watb*» 
Group 3 (3 items each):
Bowling.
Football.
Golf.
Pool.
Field Events.
Literature.
Poker.
Skiing.
Sailing.
Stamp Collecting.
Diving.


Group 4 (2 items each):
Chess.
Riding.
Woodworking.
Bridge.
Dramatics.
Music appreciation.
Wrestling.
Boxing.


Group 5 (1 item):
Piano.

Two representative items are:

To "draw," a pool player hits the cue ball:

A. At the right.
B. At the left.
C. High.
D. Low.
E. Don't know.

To keep his opponent from getting a combination reverse headlock and arm bar, a wrestler should keep his:

A. Elbows away from his side.
B. Elbows against his side.
C. Arms in front of him.
D. Head down.
E. Don't know.


Keying. — Empirical pilot and navigator keys were constructed on the basis of item validations. Two pilot keys were constructed for each form of the test by separating the odd testing-number papers from the even testing-number papers, thereby obtaining two samples for each form of the test on which to do item validation. For form CE506D the odds group was composed of 422 graduates and 169 eliminees from primary training, and the evens group, of 366 graduates and 170 eliminees. For form CE506E there were 288 graduates and 143 eliminees in the odds group, and 311 graduates and 146 eliminees in the evens group. Examinees were in classes 43I, 43J, and 43K, and they had been tested at Psychological Research Unit No. 1. Responses whose tetrachoric correlation with primary pass-fail data is beyond ±0.12 and whose level of difficulty falls between 90 percent and 10 percent are weighted. In scoring, the odds keys were used to score the papers of the evens samples, and vice versa. The same procedure was followed in obtaining navigator keys, except that only one form of the test (CE506D) was used. In the odds group, there were 302 graduates and 30 eliminees; in the evens group, 300 graduates and 30 eliminees. Examinees were in class 44-9 and had been tested at Psychological Research Unit No. 1. The key, however, was developed at Psychological Research Unit No. 2. Responses whose correlation (phi coefficient) with graduation-elimination from navigator training is beyond ±0.11 and whose level of difficulty is between 85 percent and 15 percent are weighted. Here, also, the odds key was used to score the papers of the evens sample, and vice versa.
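
The keying procedure can be sketched as follows. The response identifiers, the statistics, and the function names are hypothetical. The sketch reads the criterion as an absolute correlation above the cutoff and treats the difficulty limits as inclusive, both of which are assumptions. The defaults are the pilot-key values given above; the navigator values (0.11, 15 and 85 percent) would be passed as arguments.

```python
def split_by_testing_number(papers):
    """Divide papers into odd and even testing-number samples."""
    odds = [p for p in papers if p["testing_number"] % 2 == 1]
    evens = [p for p in papers if p["testing_number"] % 2 == 0]
    return odds, evens


def build_key(response_stats, min_abs_r=0.12, min_prop=0.10, max_prop=0.90):
    """Weight a response +1 or -1 when it is valid enough and not too extreme in difficulty.

    response_stats maps a response id to (validity r, proportion choosing it).
    """
    key = {}
    for response_id, (validity_r, proportion) in response_stats.items():
        if abs(validity_r) > min_abs_r and min_prop <= proportion <= max_prop:
            key[response_id] = 1 if validity_r > 0 else -1
    return key


# Hypothetical response statistics computed on the odds sample only.
odds_stats = {
    "item07-B": (0.21, 0.55),    # positive validity: weighted +1
    "item07-E": (-0.18, 0.30),   # negative validity: weighted -1
    "item12-A": (0.05, 0.60),    # validity too small: unweighted
    "item15-C": (0.25, 0.95),    # chosen by too many: unweighted
}
odds_key = build_key(odds_stats)   # to be applied to the evens papers, not the odds

papers = [{"testing_number": n} for n in (101, 102, 103, 104, 105)]
odds_papers, evens_papers = split_by_testing_number(papers)
```

Scoring each half with the key built on the other half keeps the validation sample independent of the sample on which the responses were chosen.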

Validity. — The keys were tested in the manner described above, the scoring formula for the pilot keys being R − W/3, and for the navigator key, both R − W/3 and R − W. In these formulas, R refers to responses of positive validity and W to responses of negative validity. The formula R − W/3 is, of course, designed to correct for chance success; the formula R − W assays the preponderance of responses of positive validity over those of negative validity. The resultant biserial validities for each pilot key with the elementary pass-fail criterion are given in table 14.2. The biserial validities for each navigator key with graduation-elimination from navigator training are given in table 14.3.
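
A minimal sketch of the two scoring formulas, applied to a keyed paper, is given below. The key and the examinee's responses are hypothetical, and the function name is illustrative; R counts chosen responses of positive validity and W counts chosen responses of negative validity, as defined above.

```python
def keyed_score(chosen_responses, key, wrong_divisor: int = 3) -> float:
    """Return R - W/d, counting only responses that appear in the empirical key."""
    rights = sum(1 for response in chosen_responses if key.get(response, 0) > 0)
    wrongs = sum(1 for response in chosen_responses if key.get(response, 0) < 0)
    return rights - wrongs / wrong_divisor


# Hypothetical key (+1 = positive validity, -1 = negative validity).
example_key = {"1A": 1, "2C": -1, "3B": 1, "4D": -1, "5A": -1}

# Hypothetical paper: two keyed-positive, three keyed-negative, one unkeyed response.
paper = ["1A", "2C", "3B", "4D", "5A", "6E"]

score_r_minus_w_over_3 = keyed_score(paper, example_key, wrong_divisor=3)
score_r_minus_w = keyed_score(paper, example_key, wrong_divisor=1)
```

For the same paper the R − W score can never exceed the R − W/3 score, since the same count of negative responses is subtracted in full.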


Table 14.2. — Validity of scoring keys of Sports and Hobbies Participation Test,
CE506D and 506E, for graduation-elimination from primary training

Form     Sample   Key      N_t    p_g     M_g     M_e    SD_t   r_bis   r_bis (corr.)¹
CE506D   Odds     Evens²   591    0.71   10.16    6.38   6.37   0.36    0.38
CE506D   Evens    Odds³    536     .68    4.78    1.30   6.73    .42     .45
CE506E   Odds     Evens⁴   432     .67    5.11    1.29   7.53    .31     .44
CE506E   Evens    Odds⁵    456     .68   15.06   11.53   7.03    .40     .42

¹ Corrected to an unrestricted stanine standard deviation of 2.00.
² The key contains 40 responses of positive validity and 31 of negative validity.
³ The key contains 33 responses of positive validity and 38 of negative validity.
⁴ The key contains 38 responses of positive validity and 45 of negative validity.
⁵ The key contains 45 responses of positive validity and 24 of negative validity.
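
The report does not print the formula behind the biserial validities. The sketch below assumes the conventional biserial formula, r_bis = (M_g - M_e)/SD_t × pq/y, where y is the ordinate of the unit normal curve at the point dividing the proportions p and q. The function name is hypothetical; the figures are those of the first row of table 14.2.

```python
from statistics import NormalDist

def biserial(mean_grad, mean_elim, sd_total, p_grad):
    """Conventional biserial r from group means, total SD, and proportion graduating."""
    q = 1.0 - p_grad
    z = NormalDist().inv_cdf(p_grad)   # cut point dividing graduates from eliminees
    y = NormalDist().pdf(z)            # ordinate of the normal curve at the cut
    return (mean_grad - mean_elim) / sd_total * (p_grad * q) / y

# First row of table 14.2: M_g = 10.16, M_e = 6.38, SD_t = 6.37, p_g = 0.71.
r = biserial(10.16, 6.38, 6.37, 0.71)
```

Under this assumption the result rounds to 0.36, which agrees with the tabled value for that row.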


Table 14.3. — Validity of scoring keys of Sports and Hobbies Participation Test,
CE506E, for navigator samples

Sample   Key¹      M_g     M_e    SD_t   r_bis   r_bis (corr.)⁴
Odds     Evens²    3.93    0.75   4.21   0.39    0.39
Evens    Odds²     6.29    3.19   4.79    .43     .34
Odds     Evens³   10.08    7.86   3.08    .47     .46
Evens    Odds³    12.83   10.41   3.50    .35     .36

¹ In the odds key there are 32 responses of positive validity and 29 of negative validity. In the
evens key, the corresponding figures are 31 and 26.
² Score is R-W.
³ Score is R-W/3.
⁴ Assuming an unrestricted stanine standard deviation of 2.00.


Validity of specific avocations. — Scores for the various activities in
form CE506C were validated separately for pilots in classes 44F and
44G, who had been tested at Psychological Research Unit No. 1. The
score for the purposes of this validation is the number correct. The
validities are shown in table 14.4.

Table 14.4. — Validation data for the sub-tests of Sports and Hobbies Participation
Test, CE506C, for graduation-elimination of pilots from primary training

Sub-test                N_t      p_g     M_g     M_e       SD_t   r_bis
Auto driving            371      0.62   14.87   12.44      3.86    0.39
Basketball              314       .70   10.47   10.1[?]    5.46     .04
Bowling                 270       .67    5.75    6.34      4.57    -.08
Diving                  5[?]0     .66    4.21    3.89      3.32     .08
Dramatics               164       .58    4.74    5.29      3.09    -.11
Firearms                436       .63   11.05    9.87      4.72     .15
Flying information A    [?]       .54    9.44    6.90      4.91     .32
Flying information B    374       .50   10.83    7.71      5.73     .34
Football                30[?]     .60    3.91    9.51      5.73    -.06
Golf                    386       .60    8.88    8.89      7.97    -.00
Horseback riding        270       .67    6.83    7.42      3.97    -.09
Hunting                 486       .65    9.28    8.02      5.56     .14
Jazz                    308       .69    5.89    6.76      5.69    -.09
Model planes            118       .68    3.94    3.58      3.28     .04
Motorcycling            302       .64    1.64    3.20      3.22     .08
Music                   118       .68    1.52    2.27      2.35    -.1[?]
Photography             311       .62    2.79    3.59      4.29    -.12
Poker                   747       .62    7.55    7.47      4.53     .01
Pool                    466       .57   12.80   11.83      5.49     .11
Radio                   302       .64    2.70    2.91      3.78    -.03
Reading                 287       .63    9.39   10.21      4.71    -.11
Tennis                  432       .62    7.71    7.36      4.47     .05
Woodworking             311       .68    7.76    7.21      3.48     .10

Use of the check-list. — A basic question in the development of any
participation test is whether the test actually measures participation.
Several types of evidence indicate that the sports-and-hobbies tests do
measure participation in the activity about which they are concerned.
Some information concerning this problem is given in results from a
check-list.

(1) Check-list, CE506X. — This instrument was administered along
with Sports and Hobbies Participation Test, CE506C. The responses
to every item were validated against the graduation-elimination criterion.
The sample used comprised students of whom 2,052 were later pilot
graduates and 1,421 were pilot eliminees from primary training (classes
44F and 44G).

The instructions for the check-list directed the student to indicate the
extent of his participation in different activities according to the
following scale:

(a) Never participated.

(b) Rarely participated.

(c) Occasionally participated.

(d) Frequently participated.

(e) Participated to a great extent.

An examination of the results showed that no single alternative yields
a high validity. An examination of the differences in correlations from
the least degree to the greatest degree of participation, however, reveals
a definite trend. The results are given in table 14.5.

(2) Keying of check-list, CE506X. — A scoring key for the check-list
items was prepared on the basis of the item validation. The key
derived from the odd-testing-numbers validation was used to score the
papers of the even-numbered sample and vice versa. The odds-sample
key contained 11 weighted items, and the evens-sample key contained
11 items. The scoring formula was the number of positively weighted
activities checked minus the number of negatively weighted activities
checked. The validities are reported in table 14.6.


Table 14.6. — Validity of Sports and Hobbies Check List, CE506X, scoring keys
for pilots in primary training

Group   Key     N_t     p_g      M_g     M_e     SD_t     r_bis   r_bis (corr.)¹
Odds    Evens   1,711   0.60     0.64   -0.27    2.1[?]   0.26    0.26
Evens   Odds    1,740    .5[?]  -2.77   -1.67    2.52      .20     .21

¹ Corrected to an unrestricted stanine standard deviation of 2.00.


(3) Validation of active vs. sedentary activities. — The avocations of
check-list CE506X were separated into those requiring considerable
physical activity and those of a sedentary nature.⁴ Students indicated their
participation on the five-point scale referred to in the previous section,
from "never participated" to "participated to a great extent," and were
given a cumulative active-sports-participation score and a cumulative
sedentary-activities-participation score. The results are shown in table
14.7.


Table 14.7. — Validity data for participation in active sports and participation in
sedentary activity as determined by Check List, CE506X, for graduation-elimination
from primary training for two samples of pilots

Group                     N_t     p_g      M_g       M_e     SD_t     r_bis
Active sports:
  Odds population         1711    0.60     26.79     25.15   5.[?]    0.15
  Evens population        1740     .5[?]   26.61     25.74   6.02      .09
Sedentary activities:
  Odds population         1711     .60     16.52     17.07   [?].70   -.09
  Evens population        1740     .5[?]   16.4[?]   17.04   [?]      -.[?]

They  indicate  that  trainees  who  have  engaged  in  sports  and  hobbies 
involving  physical  activity  have  a  slightly  greater  chance  of  graduation 
from  primary  training  than  do  those  who  have  engaged  in  sedentary 
activities,  though  the  relationship  is  very  low  and  much  less  than  that  for 
an  empirically  developed  key. 
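
The cumulative participation scores described before table 14.7 can be sketched as follows. The report does not state what numerical value was attached to each step of the five-point scale; the coding 1 to 5 used here is an assumption, as are the function name and the short activity lists.

```python
# Assumed numerical coding of the scale (a) through (e); not given in the report.
SCALE = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}

def cumulative_scores(answers, active, sedentary):
    """Return (active-sports score, sedentary-activities score) for one student.

    answers maps an activity name to the scale letter the student checked.
    Activities the student left unanswered are skipped.
    """
    active_score = sum(SCALE[answers[x]] for x in active if x in answers)
    sedentary_score = sum(SCALE[answers[x]] for x in sedentary if x in answers)
    return active_score, sedentary_score

# Hypothetical student and abbreviated activity lists.
answers = {"hunting": "d", "skiing": "a", "football": "c",
           "poker": "b", "reading": "e"}
active = ["hunting", "skiing", "football"]
sedentary = ["poker", "reading"]
scores = cumulative_scores(answers, active, sedentary)
```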

Conclusions. — Knowledge about certain activities (automobile driving,
flying, hunting, firearms, diving, model-plane building, and sailing)
is related to success in pilot training. These are all outdoor, active
avocations requiring the use of the body in various types of coordination.
These activities tend to preclude group participation, being in each case
highly individual for the principal participant. Most of the activities
require the use of equipment, the possession or use of which appears
to indicate average or better socio-economic status. Whether this aspect,
as such, contributes to validity is unknown.

⁴ The active sports included basketball, bowling, [illegible], [illegible], football, [illegible],
riding, hunting, motorcycling, sailing, skiing, and wrestling. The sedentary activities
included bridge, chess, jazz, music appreciation, piano, poker, radio, reading, and stamp collecting.

Knowledge concerning a second group of activities (basketball,
horseback riding, poker, pool, skiing, tennis, and track) appears to bear
some relationship to the pilot criterion, though the validities of the
subtest scores and of specific questions are not great. With the exception
of poker and pool, these avocations are also active, outdoor activities.
Poker and pool may be valid because they indicate masculine avocational
interests.

A third group of activities (jazz, musical appreciation, piano,
photography, radio, and stamp collecting) yields slightly negative
validities against the pilot graduation criterion. These activities are
largely sedentary and some require thinking, rather than the use of the
body, in various types of coordination, and they are usually carried on
indoors.

A fourth group of activities (boxing, bowling, football, woodworking,
and wrestling) has no relationship with the criterion. This lack of
relationship may be due in part to the following difficulties encountered
in test construction:

(a) The boxing and wrestling questions proved to be generally
unsatisfactory. This was in part anticipated during the initial item
construction because of regional differences in terminology, rules, and
practices for amateur, college, and professional events.

(b) The bowling and woodworking items were not satisfactory,
probably because of geographical variations in popularity, together with
certain terminological variations which made difficult the framing of
adequate questions. In bowling, for example, duckpins and tenpins are
popular in different sections of the country. Woodworking might well be
associated with a range of activities varying from rough carpentering to
cabinet making.

(c) The football items were unsatisfactory because of the difficulty in
developing differentiating questions. Questions about flying that can be
answered only by those who have participated in the activity are
relatively easy to construct and are meaningful, whereas in football such
questions are not easy to write. The unusual popularity of football as a
spectator sport makes difficult a clear discrimination between participant
and spectator.

Reading, or knowledge of literature, consistently yields a very low
negative correlation with the pilot criterion, regardless of the specific
method used in measuring participation therein. This negative
relationship shows that absorption in this sedentary, abstract avocation
is not conducive to success in pilot training.

Generally, questions about how a given end is achieved, when to use a
specific approach, or how a thing works, are superior to questions about
rules or regulations. It is interesting to note that this differentiation may
reflect a difference between participation in and observation of an
activity.

Apparently, also, the test may be keyed with equal success for the
navigator-training criterion. Unfortunately, no detailed hobby-by-hobby
analysis is available for navigators.

A sports and hobbies test, then, appears to be a successful selection
instrument. A detailed item analysis, however, is required before such
a test can be used for predicting any single criterion variable.

Disposition. — Items from the Sports and Hobbies Participation Test
were incorporated in both the AAF Qualifying Examination and the
General Information Test, CE505E, of the classification battery.

GENERAL INFORMATION TESTS

This group of tests covers the fields of technical information that
would be acquired by those having the interests appropriate to potential
success in air-crew positions. These interests are in the areas of
aviation information, mechanical information, active sports, navigation,
astronomy, gunnery, etc. The items were constructed to indicate
participation in, rather than book knowledge of, these subjects.

Technical Vocabulary Test, CE505C *

This is the first form of general-information test included in the
classification battery, which it entered in July 1942.

Description. (1) Internal characteristics. — The test is made up of
five-choice vocabulary and information items. Certain items are
concerned with planes, plane identification, flying techniques, etc., and
they yield a score for pilot information. Others are concerned with
astronomy, instruments, maps, etc., and they give a score for navigator
information. Still others deal with guns, bomb sights, trajectories, etc.,
and yield a score for bombardier information. Of the 100 items in the test,
40 are pilot items, 40 are navigator items, and 20 are bombardier items.
The following sample items are taken from the pilot, navigator, and
bombardier sections of the test, respectively:

The plane with a cannon in its nose is manufactured by:

A. Bell.

B. Boeing.

C. Sikorsky.

D. Douglas.

E. Vultee.

Time is usually calculated with reference to:

A. The Naval Observatory in Washington.

B. Zero degrees latitude.

C. Greenwich.

D. The International Date Line.

E. The League of Nations' Observatory in Geneva.

The extent of scatter of bombs around a target is usually expressed in terms of:

A. Angles of divergence.

B. Yards.

C. Probable error.

D. The Error Scatter Pattern.

E. Concentric circles of error.

* Developed at Psychological Research Unit No. 1. Chief contributors: Maj. [illegible] and
[illegible].

(2) Administration. — Directions are printed on the test booklet, so
the test is largely self-administering. The time allowed for the test is
36 minutes, divided into 3 parts. After 12 minutes, all subjects are
instructed to go ahead to part II even if they have not finished part I;
and after 24 minutes they are instructed to go ahead to part III.

(3) Scoring. — There are three scores, one each for the pilot, navigator,
and bombardier set of items. The scoring formula for each of the three
subtests is R - W/4.
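
For five-choice items the divisor 4 is what makes blind guessing worthless on the average: one guess in five is right, four in five are wrong, and each wrong answer costs a quarter of a point. The sketch below works this out with exact fractions. The function names are hypothetical, and the treatment of omitted items (counted neither right nor wrong) is an assumption.

```python
from fractions import Fraction

def corrected_score(rights, wrongs, choices=5):
    """Formula score R - W/(choices - 1); omitted items count neither way."""
    return rights - Fraction(wrongs, choices - 1)

def expected_guessing_score(n_items, choices=5):
    """Expected formula score when every item is answered by blind guessing."""
    p_right = Fraction(1, choices)
    expected_rights = n_items * p_right
    expected_wrongs = n_items * (1 - p_right)
    return expected_rights - expected_wrongs / (choices - 1)

# A 40-item subtest answered entirely by guessing, and one hypothetical paper
# with 25 rights, 12 wrongs, and 3 omissions.
guess = expected_guessing_score(40)
paper = corrected_score(25, 12)
```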

Statistical results (pilot score). — Results will be presented separately
for the pilot, bombardier, and navigator sections of the test. The pilot
score will be treated first.

(1) Distribution statistics. — Distribution data are shown in table 14.8.


Table 14.8. — Distribution data for Technical Vocabulary, CE505C, pilot score,
for samples of unclassified aviation students


(2) Internal consistency. — The internal consistency of items in the
pilot score is indicated by a mean phi of 0.36, with a range from 0.00
to 0.75, and a standard deviation of 0.15, based on the highest 27 percent
and the lowest 27 percent of 360 pilots in classes 43K and 44C, who had
been tested at Psychological Research Unit No. 3.
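
The report does not say how phi was obtained from the two extreme groups. One plausible reading, sketched below, takes phi as the fourfold correlation between group membership (upper or lower 27 percent on total score) and passing the item; with groups of equal size this reduces to (p_U - p_L) / (2·sqrt(p̄·q̄)), where p̄ is the mean of the two proportions passing. The function name and the ten-man sample are hypothetical.

```python
from math import sqrt

def upper_lower_phi(item_correct, total_scores, fraction=0.27):
    """Phi between extreme-group membership and success on one item."""
    n = len(total_scores)
    k = max(1, round(n * fraction))            # size of each extreme group
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower, upper = order[:k], order[-k:]
    p_u = sum(item_correct[i] for i in upper) / k   # proportion passing, upper group
    p_l = sum(item_correct[i] for i in lower) / k   # proportion passing, lower group
    p_bar = (p_u + p_l) / 2
    if p_bar in (0.0, 1.0):                    # item passed by all or by none
        return 0.0
    return (p_u - p_l) / (2 * sqrt(p_bar * (1 - p_bar)))

# Hypothetical data: ten examinees with total scores 1 to 10.
totals = list(range(1, 11))
item = [0, 0, 1, 0, 1, 0, 1, 1, 1, 1]
item_phi = upper_lower_phi(item, totals)
```

Since each item contributes to the total score on which the groups are formed, some positive phi is to be expected even for unrelated items, and the more so the shorter the test.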

(3) Reliability coefficients. — Two estimates of reliability, corrected
for length, were comparable, as indicated in table 14.9.

Table 14.9. — Reliability of Technical Vocabulary, CE505C, pilot score, for samples
of pilots

¹ Tested at Psychological Research Unit No. 2. Classes 43K and 44C.
² Tested at Psychological Research Unit No. 3, November 1942. Classes 43K and 44C.
³ Median of 3 r's corrected to triple length.
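
The correction for length is not spelled out in the report. The sketch below assumes the customary Spearman-Brown formula, r_kk = k·r / (1 + (k - 1)·r), applied with k = 2 to an odd-even correlation and with k = 3 to the median of the three intercorrelations among thirds. The function names and the input correlations are hypothetical.

```python
from statistics import median

def spearman_brown(r, k):
    """Reliability of a test k times as long as the part whose reliability is r."""
    return k * r / (1 + (k - 1) * r)

def odd_even_reliability(r_halves):
    """Correct the correlation between odd and even halves to full length."""
    return spearman_brown(r_halves, 2)

def thirds_reliability(r12, r13, r23):
    """Correct the median intercorrelation of thirds to triple length."""
    return spearman_brown(median([r12, r13, r23]), 3)

# Hypothetical part correlations.
full_from_halves = odd_even_reliability(0.60)
full_from_thirds = thirds_reliability(0.50, 0.55, 0.60)
```

Because the formula rises steadily with r, taking the median before or after the correction gives the same result.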


(4) Difficulty level. — The difficulty level of pilot items is indicated
by a mean proportion of correct responses equal to 0.56, corrected for
chance success. The proportions range from 0.17 to 0.97 with a standard
deviation of 0.17. These data are based upon results from 400 pilots in
classes 43K and 44C, who had been tested at Psychological Research
Unit No. 3.

(5) Factorial composition. — The pilot score was analyzed with the
December 1942 classification battery (N = 3,000). Substantial loadings
were obtained on the verbal (0.41), pilot-interest (0.34), and
mechanical-experience (0.39) factors. Inconsequential loadings were
obtained on the perceptual-speed, numerical, spatial-relations,
visualization, general-reasoning, and psychomotor-coordination factors.
Its communality in the battery was 0.47, which is considerably short of
its reliability. For a full description of the factorial composition of
this test, see appendix B.
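
The relation between the loadings and the communality can be checked by arithmetic. The sketch assumes orthogonal common factors, so that the communality is the sum of the squared loadings; the variable names are hypothetical, and only the three substantial loadings quoted above are used.

```python
def communality(loadings):
    """Sum of squared loadings on orthogonal common factors."""
    return sum(x * x for x in loadings)

# Substantial loadings of the pilot score quoted in the text.
substantial = {"verbal": 0.41, "pilot-interest": 0.34, "mechanical-experience": 0.39}

h2_major = communality(substantial.values())   # 0.1681 + 0.1156 + 0.1521
h2_minor = 0.47 - h2_major                      # left for the six minor factors
```

On this reckoning the three named factors account for about 0.44 of the reported communality of 0.47, and the six factors with inconsequential loadings for the remaining 0.03 or so. The gap between the communality and the reliability is variance that is reliable but not shared with the other tests of the battery.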

(6) Test validity. — Table 14.10 includes typical validities for the
pilot score.

(7) Item validity. — The validity of responses to pilot-score items is
indicated by a mean phi of 0.03. The range of phis is from -0.05 to 0.16
with a standard deviation of 0.04. These data are based upon responses
of 800 graduates and 600 eliminees from elementary pilot training
(classes 43K and 44C; originally tested at Psychological Research Unit
No. 3), assuming a p_g of 0.50. In general, the items that have the best
predictive value deal with plane identification, technical terms related to
flying, technical names of planes or plane parts, and experiences related
specifically to flying. Items that have low or negative value deal with
historical events, the names of scientific inventors, sports records and
events, and "book learning" in general.


Statistical results (bombardier score). — Extensive data are available
for the bombardier score also.

(1) Distribution statistics. — The data are shown in table 14.11.


Table 14.11. — Distribution data for Technical Vocabulary, CE505C, bombardier
score, for samples of unclassified aviation students

N         Date of testing               Psychological research unit No.
1,0[?]    December 1942                 [?]
2,374     September and October 1942    [?]
1,015     December 1942                 [?]
1,143     December 1942                 [?]


(2) Internal consistency. — The internal consistency of bombardier
items is indicated by a mean phi of 0.35 with a range from 0.18 to 0.53
and a standard deviation of 0.09, based on the highest 27 percent and the
lowest 27 percent of total scores of 360 pilots in classes 43K and 44C,
originally tested at Psychological Research Unit No. 3. A mean phi of
approximately 0.25 would be expected by chance in a test of this length
(20 items).

(3) Reliability coefficients. — Two estimates of reliability, corrected for
length, indicate that the bombardier score is very unreliable. See table
14.12.


Table 14.12. — Reliability of Technical Vocabulary, CE505C, bombardier score, for
samples of pilots

Type                     N       r_tt
Odd-even                 ¹209    0.37
Correlation of thirds    ²[?]    ³.47

¹ Tested at Psychological Research Unit No. 2. Classes 43K and 44C.
² Tested at Psychological Research Unit No. 1. Classes 43K and 44C.
³ Median of 3 r's corrected to triple length.


(4) Difficulty level. — The difficulty level of bombardier items is
indicated by a mean proportion of correct responses of 0.34, corrected for
chance success. The proportions range from 0.00 to 0.89 with a standard
deviation of 0.26. These data are based on 1,400 pilots in classes 43K
and 44C, originally tested at Psychological Research Unit No. 3.

(5) Factorial composition. — The test was analyzed in the December
1942 classification battery (N = 3,000). Its only substantial loadings are
on the verbal (0.44) and pilot-interest (0.33) factors in an analysis in
which mechanical-experience, perceptual-speed, numerical,
spatial-relations, visualization, general-reasoning, and
psychomotor-coordination factors are also found. It had a communality of
0.35 in the battery, which exhausted its nonchance variance. For a full
description of the factorial composition of this test, see appendix B.

(6) Validity. — Validities of the bombardier score for various
air-crew and technical specialties are presented in table 14.13. This score
proved to be more valid for navigators and pilots than for bombardiers.
If the three validities were corrected for attenuation, however, this
conclusion might not still hold.

Statistical results (navigator score). — For the navigator score the
data are as follows:

(1) Distribution statistics. — Distribution data are shown in table
14.14.


Table 14.14. — Distribution data for Technical Vocabulary, CE505C, navigator score,
for samples of unclassified aviation students

N        Date of testing               Psychological research unit No.    M       SD
1,096    December 1942                 1                                  11.2    6.7
2,376    September and October 1942    2                                  10.3    6.2
1,015    December 1942                 2                                  10.1    6.4
1,143    December 1942                 3                                  10.8    6.4

(2) Internal consistency. — The internal consistency of navigator items
is indicated by a mean phi of 0.38 with a range from 0.06 to 0.70 and
a standard deviation of 0.15. These data are based on the highest 27
percent and the lowest 27 percent in total score of 360 pilots in classes
43K and 44C, originally tested at Psychological Research Unit No. 3.

(3) Reliability coefficients. — Two concordant estimates of reliability,
corrected for length, appear in table 14.15.


Table 14.15. — Reliability of Technical Vocabulary, CE505C, navigator score, based
on samples of pilots

Type                     N          r_tt
Odd-even                 ¹20[?]     0.79
Correlation of thirds    ²365       ³.82

¹ Tested at Psychological Research Unit No. 2. Classes 43K and 44C.
² Tested at Psychological Research Unit No. 3. Classes 43K and 44C.
³ Median of 3 r's corrected to triple length.


(4) Difficulty. — The difficulty level of navigator items is indicated by
a mean proportion of correct responses equal to 0.32, corrected for
chance success. The proportions range from 0.00 to 0.89 with a standard
deviation of 0.36. These data are based upon results from 360 pilots in
classes 43K and 44C, originally tested at Psychological Research Unit
No. 3.

(5) Factorial composition. — The navigator score of Technical
Vocabulary Test, CE505D, was analyzed in three different batteries having
a total N of 3,638. Its only significant loading is on the verbal factor
(weighted mean of 0.74), and so it appears to be a pure verbal-ability
test. Other factors that appeared in these analyses but on which the
navigator score had inconsequential loadings are mechanical experience,
perceptual speed, numerical, psychomotor coordination, general reasoning,
visualization, spatial relations, and length estimation. The weighted mean
of the communalities of the navigator score in these three batteries is 0.67.
For a full description of the factorial composition of this test, see
appendix B.

(6) Validity. — Because this test was used in the classification battery,
extensive validity data are available for it. Table 14.16 gives typical
validation results for air-crew and technical specialties. Table 14.17
shows validation data against the criteria of seven navigator grades. For
comparison, validation data are given for the pilot and bombardier scores
for the same sample and criteria. It is apparent that the test provides a
satisfactory navigator-selection score, and since it correlates so slightly
with bombardier and pilot criteria, it is also a good classification test.


Table 14.17. — Validity data for pilot, bombardier, and navigator scores of Technical
Vocabulary, CE505C, against the criteria of navigation grades¹

Grade                               Score    r        r (corr.)²
Dead reckoning (ground school)      P        0.09     0.15
                                    B         .15      .23
                                    N         .21      .36
[?]                                 P        -.01      .05
                                    B         .10      .17
                                    N         .08      .22
[?]                                 P         .01      .0[?]
                                    B         .04      .09
                                    N         .00      .10
Celestial navigation (flight)       P         .03      .08
                                    B         .02      .09
                                    N         .04      .17
Meteorology                         P         .09      .14
                                    B         .13      .20
                                    N         [?]      .44
Military                            P        -.03      .00
                                    B         .02      .05
                                    N         .0[?]    .12
[?]                                 P         .04      .11
                                    B         .11      .20
                                    N         .1[?]    .31

¹ For a sample of trainees in Hondo classes 43-10 through 43-15. For the bombardier score,
a sample of 426 examinees tested at Psychological Research Units Nos. 1 and 2 was used. For
the other scores, the sample comprised 463 examinees from Psychological Research Units Nos.
1, 2, and 3.

² Assumed unrestricted stanine standard deviation not reported. All r's are product-moment
correlations.

(7) Part-score intercorrelations. — The three scores intercorrelate as
follows: r_PN = 0.20, r_PB = 0.34, r_BN = 0.54. Here it is satisfying to
find that the two scores (pilot and navigator) most valid for their own
specialties, and at the same time fairly reliable, correlate so little with
each other. This makes the two tests excellent classification as well as
selection instruments.

Evaluation. — Some of the defects noted in the Technical Vocabulary
Test, CE505C, are as follows:

(a) The reliability of the bombardier score is so low that it is of
questionable value for classification purposes, even when it carries a
small percentage of the weight in a composite score.

(b) The bombardier score overlaps the navigator score so much (r_BN
approaches 1.00 when corrected for attenuation) that a separate score
has little meaning other than deviation due to sampling errors. Its
loading on the aviation-interest factor does indicate, however, that it is
not functionally identical with the navigator score.
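
The statement that the bombardier-navigator correlation approaches 1.00 when corrected for attenuation can be checked with the usual formula, r_xy / sqrt(r_xx · r_yy). The sketch below is illustrative only: the function name is hypothetical, and the reliabilities are the odd-even values as read from tables 14.12 and 14.15 (0.37 for the bombardier score, a reading that is uncertain because that table is poorly legible, and 0.79 for the navigator score).

```python
from math import sqrt

def disattenuate(r_xy, r_xx, r_yy):
    """Correlation between two scores corrected for the unreliability of both."""
    return r_xy / sqrt(r_xx * r_yy)

# r_BN = 0.54 from the part-score intercorrelations; reliabilities as read
# from tables 14.12 and 14.15 (odd-even estimates).
r_bn_corrected = disattenuate(0.54, 0.37, 0.79)

# The same correction with the correlation-of-thirds estimates (.47 and .82).
r_bn_corrected_thirds = disattenuate(0.54, 0.47, 0.82)
```

With the odd-even estimates the corrected value falls just short of 1.00; with the correlation-of-thirds estimates it is noticeably lower, so the conclusion depends on which reliability estimate is taken.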

In view of these considerations, the Technical Vocabulary Test was
revised, in the process of which the bombardier score was dropped, and
the title was changed to General Information Test, CE505D.

The pilot validities to be expected from the factorial configurations
of the three scores are 0.13, 0.11, and 0.19, respectively, for bombardier,
navigator, and pilot sections. Averages of obtained pilot validities are
0.12, 0.09, and 0.21, respectively. The closeness of predicted to obtained
pilot validities indicates that the full reasons for the latter are known in
terms of common factors.


General Information Test, CE505D *

This test, a revision of the Technical Vocabulary Test, CE505C, is
designed to measure various types of background information as an
indication of interests suitable for training as a pilot or navigator.

Description. (1) Internal characteristics. — The revision, besides
deleting the bombardier section of the earlier test, extends the pilot
section to 60 items, with revision of certain items. The navigator items
were left unchanged. The resulting test thus consists of 60 pilot and 40
navigator items. The added pilot items are more concerned with background
sports and hobbies than with flying experience. The following item
illustrates the new type of material. The other items are of the types
illustrated in connection with test CE505C.


On most motorcycles, the throttle is operated by:

A. Pressing a lever on the handle bar.

B. Depressing a foot pedal.

C. Turning one of the handle grips.

D. Depressing a foot pedal for quantity and turning.

E. Don't know.

(2) Administration. — The time allowed for the test is 36 minutes,
and the 3 parts are separately timed. After 14 minutes all examinees are
instructed to go on to part II (items 40-78), even if they have not
finished part I (items 1-39), and after 28 minutes they are instructed to
go on to part III.

(3) Scoring. — The test is scored both for pilot items and navigator
items. The score is the number of right responses. For certain items
which had been found to bear a negative relationship to pilot success,
the incorrect responses in terms of truth or fact are all keyed as rights.

Statistical results (pilot score). — Results will be presented for the
pilot score only, since the navigator score is identical with that of form
CE505C.


* Developed in the Office of the Air Surgeon, Headquarters, AAF. Chief contributors:
T/Sgt. Robert R. Weke, Capt. Frederick B. Davis, and Capt. Donald E. Super.

(1) Distribution statistics. — Using a sample of 3,000 unclassified
aviation students (tested at all three Research Units), a mean of 34.7 and
a standard deviation of 6.4 were obtained for the pilot score.

(2) Reliability coefficient. — A corrected reliability coefficient of 0.87
was obtained by the odd-even method on a sample of 1,500 unclassified
aviation students. This is a satisfying improvement over the reliability
for CE505C pilot score (r_tt approximately 0.80).

(3) Factorial composition. — General Information Test, CE505D, was
analyzed in the July 1943 classification battery. The pilot score has
loadings of 0.38 on the pilot-interest factor, 0.35 on the verbal factor,
and 0.30 on the mechanical-experience factor. It has negligible loadings on
the perceptual-speed, numerical, spatial-relations,
psychomotor-coordination, and general-reasoning factors. This represents a
slight improvement in the intended directions: less verbal variance and
more pilot-interest variance. Its communality with the other tests in the
battery is 0.43. For a full description of the factorial composition of
this test, see appendix B.

(4) Validity. — Table 14.18 gives the validity of the pilot score for
pilot training.


Table 14.18.—Validity data for General Information, CE505D (pilot score),
graduation-elimination criterion


• Corrected to an unrestricted stanine standard deviation of 2.00.

• In class 44E. Tested at Psychological Research Units Nos. 1, 2, and 3.


Evaluation.—The revisions made in the pilot section of this test had
the effect of increasing its validity roughly from 0.21 to 0.24. Its most
important loadings are on the pilot-interest and mechanical-experience
factors. The loading in the pilot-interest factor indicates the value of
the test as a measure of interest and motivation. The pilot validity to
be expected from the known factors is 0.20. Since the expected and
obtained pilot validities are not far apart, we may conclude that all valid
factors in the test probably are accounted for. The presence of the
verbal variance, even in reduced amount, is of some concern, since this has
slight negative validity for the pilot.

The  navigator  section  of  the  test  is  the  same  as  in  test  CE505C,  and 
it  continued  to  yield  good  results.  It  was  decided  to  drop  the  navigator 
section  of  the  test  in  the  next  revision  (CE505E),  however,  for  the 
following  reasons: 

(a)  The  only  significant  variance  in  the  navigator  score  is  due  to  the 
verbal  factor,  which  is  adequately  measured  by  other  tests  in  the  bat¬ 
tery,  e.  g.,  Reading  Comprehension  Test,  0614. 

(b)  Mathematics  Test,  CI702E,  correlates  slightly  higher  with  ex¬ 
pressed  preference  for  navigation  training  (r=0.59)  than  does  the 
navigator score of General Information, CE505D (r=0.55). This indicates
that if it is navigation interest that is to be measured by the information
test (factorial results fail to exhibit such a factor), then the
mathematics test measures it at least as well, and possibly better.

General Information Test, CE505E *

When the tests to be included in the November 1943 Classification
Battery were selected, it was decided to construct a new General Information
Test to replace form CE505D. Accumulated data indicated that
items of the following types should be included:

(1)  Aviation-information  items  of  the  kinds  used  in  General  Infor¬ 
mation  Test,  CE505D. 

(2)  Flying-information  items  developed  and  validated  at  Psychologi¬ 
cal  Research  Unit  No.  1. 

(3) Driving-information items of the type developed and validated at
Psychological Research Units No. 1 and No. 3.

(4) Mechanical-information items developed and validated at Psychological
Research Unit No. 3 (the Mechanical Information Test,
CI905A, having been removed from the classification battery).

(5) Technical-vocabulary items developed and validated at Headquarters,
Army Air Forces, Office of The Air Surgeon.

(6) Sports-and-hobbies items developed and validated at Psychological
Research Unit No. 1 and at Headquarters, Army Air Forces, Office
of The Air Surgeon.

Selection  of  items. — When  the  items  for  the  final  form  of  Test 
CE505E  were  selected,  a  number  of  considerations  in  addition  to  the 
anticipated  relative  sizes  of  the  beta  weights  were  taken  into  account. 
Among  them  were  the  sizes  of  the  standard  deviations  of  groups  of  the 
five  kinds  of  items,  the  general  specifications  for  the  test,  and  the  amount 
of  reliable  validity  data  for  the  five  kinds  of  items. 

The selection of individual items was done, so far as possible, on the
basis of individual item-validity coefficients and difficulty indices. Items
having the highest validity coefficients and the most appropriate difficulty
indices were used. A valid item was not chosen, however, unless the
median validity coefficient of all the items of its type was positive. One
objective was to use as many valid items as possible at the level of ability
represented by the cutoff point for pilot selection by the classification
battery. Another aim was to minimize the intercorrelations of the individual
items by utilizing items covering a wide range of topics.
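
The selection rule just described can be sketched as follows. This is a minimal illustration, assuming that each item carries a type label, a validity coefficient, and a difficulty index. The field names, the target difficulty of 0.50, and the order in which ties are broken are assumptions made for the example and are not given in the source.

```python
from statistics import median


def select_items(items, n_wanted, target_difficulty=0.50):
    """Pick up to n_wanted item ids by validity, then by closeness to a target difficulty.

    items: list of dicts with keys "id", "type", "validity", "difficulty"
    (hypothetical field names).
    """
    # Collect validity coefficients by item type.
    validities_by_type = {}
    for item in items:
        validities_by_type.setdefault(item["type"], []).append(item["validity"])

    # A type is usable only if the median validity of its items is positive.
    usable_types = {t for t, vals in validities_by_type.items() if median(vals) > 0}

    # Keep individually valid items that belong to usable types.
    candidates = [it for it in items if it["type"] in usable_types and it["validity"] > 0]

    # Highest validity first; among equals, difficulty nearest the target.
    candidates.sort(key=lambda it: (-it["validity"], abs(it["difficulty"] - target_difficulty)))
    return [it["id"] for it in candidates[:n_wanted]]


# Hypothetical item pool: the "music" type has a negative median validity,
# so its one valid item (m1) is passed over.
pool = [
    {"id": "a1", "type": "aviation", "validity": 0.20, "difficulty": 0.55},
    {"id": "a2", "type": "aviation", "validity": 0.10, "difficulty": 0.40},
    {"id": "m1", "type": "music", "validity": 0.15, "difficulty": 0.50},
    {"id": "m2", "type": "music", "validity": -0.10, "difficulty": 0.50},
    {"id": "m3", "type": "music", "validity": -0.05, "difficulty": 0.50},
]
chosen = select_items(pool, 2)
```

The sketch does not attempt the further aim of minimizing item intercorrelations, which would require the item-by-item correlation matrix.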

• Developed at Psychological Research Unit No. 3. Chief contributors: Frederick B.
Davis and Capt. Lloyd G. Humphreys.
Description. (1) Internal characteristics.—The test is divided into
three  parts.  Part  I  contains  25  aviation-interest  items  and  has  an  admin¬ 
istration  time  of  10  minutes;  part  II  contains  32  flying-information  items 
and  has  an  administration  time  of  12  minutes;  and  part  III  contains  43 
items  of  mechanical  information,  driving  information,  and  sports  and 
hobbies  participation,  and  requires  14  minutes  for  administration. 

Five  sample  items  are  shown  below,  exemplifying  the  areas  of  avia¬ 
tion  interest,  sports  and  hobbies,  mechanical  information,  driving  infor¬ 
mation,  and  flying  information,  in  order. 

Which  one  of  the  following  is  most  commonly  used  to  train  pilots  on  the  ground? 

A.  The  Waco  Trainer. 

B.  The  Ryan  Trainer. 

C. The Fairchild Trainer.

D. The White Trainer.

E. The Link Trainer.

The  strongest  type  of  construction  in  skis  is  called: 

A. Concave top.

B. Flat top.

C. Ridge top.

D. Roof top.

E. Don't know.

With pressure on the starter switch, the starting motor runs smoothly, but no
contact is made between the starting motor and the engine. The most probable cause of
the trouble is that:

A. The armature of the starting motor is loose.

B. The brushes in the starting motor are not making contact with the commutator.

C. The Bendix spring is broken.

D. A fuse is blown.

E. The ignition coil is not functioning properly.

If you were driving along at 50 miles per hour and the right front tire blew out,
it would be best to tighten your hold on the steering wheel and:

64-A Step lightly on the brake pedal.

64-B Step hard on the brake pedal.

64-C Turn the wheels slightly to the right.

64-D Disengage the clutch and let the car coast to a standstill.

64-E Turn off the ignition and let the car roll to a standstill in gear.

Using  too  much  bottom  rudder  in  a  steep  turn  will  cause  the  plane  to: 

A. Slip.

B. Stall.

C. Gain altitude rapidly.

D. Perform a spiral.

E. Don't know.
(2) Scoring.—The scoring formula is R − W/4. The test is scored
only for pilots.
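
With five-choice items, R − W/4 is the conventional correction for guessing, in which each wrong answer costs 1/(k − 1) of a point for k alternatives. The sketch below illustrates the arithmetic. It assumes that omitted items count neither as rights nor as wrongs, which the source does not state, and the function name is illustrative.

```python
def formula_score(rights: int, wrongs: int, n_options: int = 5) -> float:
    """Rights minus a fraction of wrongs: R - W/(k - 1) for k-choice items."""
    return rights - wrongs / (n_options - 1)


# An examinee with 60 right, 20 wrong, and 20 omitted on the 100-item test.
example_score = formula_score(60, 20)

# Blind guessing on all 100 five-choice items yields, on the average,
# 20 rights and 80 wrongs, and therefore a formula score of zero.
chance_score = formula_score(20, 80)
```

The second case shows the purpose of the penalty: a score earned purely by guessing is expected to vanish.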

Statistical results. (1) Distribution statistics.—Distribution statistics
were obtained for both pre-CTD (College Training Detachment) and
post-CTD groups, and are presented in table 14.19. The students while
in college detachments received as much as 10 hours of flying experience.
This would account for the difference in mean score, at least in
part.


Table 14.19.—Distribution statistics for unclassified aviation students on General
Information Test, CE505E


Crmsg 

N 

M 

SD 

Post -college* . . . 

l.*O0 

1.920 

454 

ifj 

lit 

Prc-collcf e* . 

1 Tested at Psychological Research Units Nos. 1, 2, and 3 with the November 1943 Battery.
2 Tested at Medical and Psychological Examining Units Nos. 4 to 10 inclusive.


(2)  Internal  consistency. — Since  the  test  is  composed  of  five  types  of 
items,  a  single  item-analysis  based  on  total  score  would  be  meaningless. 
Five  separate  item-analyses,  therefore,  were  made,  correlating  items 
with  the  total  score  of  the  group  to  which  the  item  belongs.  The  item 
statistic used was Flanagan's r (1). The data are presented in table
14.20.


Table 14.20.—Data on internal consistency for types of items of General Information
Test, CE505E, based on a sample of 740 classified pilots*


Type  of  Rems 

Number  of  items 
in  criterion 

S», 

Range  of  r 

Aviation  Informal!— . 

IS 

0.42 

0.14 

0.00-0.94 

Sports  and  hobbies . 

14 

.44 

.JO-.JO 

Mechanical  information . 

>6 

.St 

.11 

.10-.  99 

Driving  information  . 

12 

.44 

.10 

.11 -M 

Flying  information . . 

19 

.41 

.09 

J1-.M 

* Tested at Psychological Research Unit No. 1 in November 1943.


The  five  parts  are  seen  to  be  quite  homogeneous  internally.  The  small 
number  of  items  in  each  part  enhances  the  apparent  homogeneity,  how¬ 
ever,  and  it  is  difficult  to  tell  how  much  to  attribute  to  the  spurious  part- 
whole  correlation. 
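
The spurious part-whole effect can be seen in a small example. The sketch below is not the procedure of the source, which read Flanagan's r from a table; it substitutes the ordinary product-moment coefficient and compares the correlation of an item with the part score that includes it against its correlation with the part score from which it has been removed. The response data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def item_total_and_item_rest(responses, item_index):
    """Return (r with part score including the item, r with part score excluding it)."""
    item = [row[item_index] for row in responses]
    total = [sum(row) for row in responses]
    rest = [t - i for t, i in zip(total, item)]
    return pearson(item, total), pearson(item, rest)


# Hypothetical right (1) and wrong (0) responses of six examinees to a four-item part.
responses = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 1],
]
r_with_item, r_without_item = item_total_and_item_rest(responses, 0)
```

With so few items in the part, the coefficient that includes the item is much the larger of the two, which is the inflation the text cautions against.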

The five part scores were intercorrelated, and the correlation coefficients
were corrected for restriction of range resulting from selection
on the pilot stanine. These data are given in table 14.21. There is
therefore considerable heterogeneity as between types of items.
Table 14.21.—Intercorrelations of five part-scores of General Information Test,
CE505E, corrected for restriction of range,* for a sample of 740 classified pilots


                                  1       2       3       4       5

1. Aviation information ....     ...    0.18    0.42    0.24    0.49

2. Sports and hobbies ......     .18     ...     .32     .28     .15

3. Mechanical information ..     .42     .22     ...     .47     .30

4. Driving information .....     .24     .28     .47     ...     .11

5. Flying information ......     .49     .11     .30     .11     ...

* Assuming an unrestricted standard deviation of 2.00 for the pilot stanine.
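
The source does not state which formula was used to correct for restriction of range. Where selection was made on a third variable (here the pilot stanine), one standard textbook choice is the three-variable correction often labeled Thorndike's Case 3, and the sketch below assumes that formula. All numerical inputs are hypothetical except the unrestricted stanine standard deviation of 2.00.

```python
from math import sqrt


def correct_indirect_restriction(r_xy, r_xz, r_yz, sd_z_unrestricted, sd_z_restricted):
    """Estimate the unrestricted correlation of x and y when the group was
    selected on a third variable z (three-variable correction, assumed here).

    r_xy, r_xz, r_yz are correlations observed in the restricted group.
    """
    gain = (sd_z_unrestricted / sd_z_restricted) ** 2 - 1.0
    numerator = r_xy + r_xz * r_yz * gain
    denominator = sqrt((1.0 + r_xz ** 2 * gain) * (1.0 + r_yz ** 2 * gain))
    return numerator / denominator


# Hypothetical case: two part scores correlate 0.30 among classified pilots,
# and correlate 0.25 and 0.20 with the stanine, whose standard deviation in the
# selected group is taken to be 1.40 against 2.00 in the unrestricted group.
corrected = correct_indirect_restriction(0.30, 0.25, 0.20, 2.00, 1.40)

# With no restriction the formula leaves the coefficient unchanged.
unchanged = correct_indirect_restriction(0.30, 0.25, 0.20, 2.00, 2.00)
```

When both variables are positively related to the selection variable, the corrected value exceeds the observed one, as in this example.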


(3) Reliability coefficient.—Two reliability estimates are given in
table 14.22.


Table 14.22.—Reliability estimates for General Information Test, CE505E, based
upon samples of unclassified aviation students


Typo 

N 

'a 

'1,000 

•300 

0.77 

0S7 

.73 

M 

1 Tested at Medical and Psychological Examining Unit No. 7.

2 Tested at Medical and Psychological Examining Unit No. 11.


(4) Difficulty.—The difficulty level of items in the test is indicated by
a mean proportion of correct responses of 0.48, corrected for chance
success.  The  proportions  range  from  0.02  to  0.99,  with  a  standard  devia¬ 
tion  of  0.25.  These  data  are  based  upon  results  for  450  unclassified 
aviation  students  (pre-college),  tested  at  Psychological  Research  Unit 
No.  3  in  October  1943. 

(5) Factorial composition.—The highest loadings for this test are
on the mechanical-experience (0.53), verbal (0.43), and perceptual-speed
(0.29) factors, in a battery in which spatial-relations, psychomotor-coordination,
numerical, mathematics-background, social-science-background,
and kinesthetic-motor factors are also found.

No  pilot-interest  factor  was  isolated  in  this  analysis.  The  communality 
in  this  battery  (November  1943)  is  0.65.  The  matrix  of  which  this  test 
was  a  part  presented  many  difficulties,  and  the  factorial  solution  is  not 
entirely  satisfactory.  These  results,  therefore,  must  be  taken  with  some 
reservations.  For  a  full  description  of  the  factorial  composition  of  this 
test,  see  appendix  B. 

(6)  Validity. — Validity  data  are  available  both  for  total  score  and  part 
scores.  Table  14.23  gives  validity  for  total  score  for  pilots,  WASPs, 
air mechanics, and armament trainees, and table 14.24 gives validities of
the  part  scores  and  the  total  score  for  pilots. 
Table 14.24.—Validity data for the five part-scores of General Information Test,
CE505E, for elementary pilot training
(,Vi  =  1076,  p.--m 


Score 

So.  of 
ittma 

>», 

SO, 

rn. 

1.  Aviation  information  . . . 

IS 

20.24 

11.61 

S.72 

0.15 

0.25 

2.  Spui'a  and  hobbit*  .... 
J.  Mechanical  information 

IS 

2.49 

J.26 

2.18 

.06 

.11 

20 

9.58 

8.76 

4.46 

.10 

.21 

4.  Driving  information  ... 

12 

4.55 

J.95 

2.26 

.14 

.21 

5.  Hying  information  .... 

20 

12.74 

11.92 

2.94 

.15 

.22 

i  Total  xort  . . 

100 

50.62 

46.50 

11.92 

.18 

.10 

1 These statistics were obtained after eleven of the items had been reclassified from one part
to another on the basis of correlating each item with the five part scores.

2 Corrected to an unrestricted stanine standard deviation of 2.00.

3 Tested at Psychological Research Unit No. 3. Class not reported.

Evaluation.—The revisions incorporated in this form raised its validity
for pilots to approximately 0.32. It is clear that General Information
Test,  CE505E,  is  a  highly  valuable  test  in  the  classification  battery.  Its 
validity  for  pilot  training  is  exceeded  by  few  tests.  Since  it  is  a  complex, 
tailor-made  test,  however,  its  usefulness  is  restricted  largely  to  the  pur¬ 
pose  for  which  it  was  constructed. 

Its  pilot  validity  is  not  by  any  means  fully  accounted  for  by  its  load¬ 
ings  in  the  mechanical  and  perceptual  factors,  though  much  of  it  must 
be  so  allocated.  The  pilot-interest  factor,  which  appeared  in  both  fore¬ 
runners  of  this  test,  must  have  been  increased  in  weight,  perhaps  to  as 
much  as  0.50,  judging  by  the  facts.  Its  present  mechanical  variance  (28 
percent,  if  correctly  estimated)  is  much  too  high,  and  that  variance  is 
abundantly  covered  by  other  tests  in  the  classification  battery. 
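
The figure of 28 percent follows from squaring the mechanical-experience loading reported above for this test. A brief sketch of the arithmetic is given here, assuming uncorrelated (orthogonal) factors, under which the share of test variance due to a factor is the square of its loading. The loadings and the communality are those reported in this section; the variable names are illustrative.

```python
# Loadings reported for General Information Test, CE505E.
loadings = {
    "mechanical experience": 0.53,
    "verbal": 0.43,
    "perceptual speed": 0.29,
}

# Under the orthogonality assumption, variance share = loading squared.
variance_share = {name: round(a * a, 2) for name, a in loadings.items()}

# Part of the reported communality (0.65) not covered by these three factors.
communality = 0.65
remainder = communality - sum(a * a for a in loadings.values())
```

The three named factors account for about 0.55 of the variance, leaving roughly 0.10 of the communality to the remaining factors of the battery.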

General Information Test, CE505F *

This  test  is  a  revision  of  form  CE505E. 

Informal job analyses of combat flying led to the conclusion that increasing
the number of sports-and-hobbies items (especially those negatively
keyed), and the number of mechanical-information items would
make the test more prognostic of combat success by enhancing variance
in an assumed factor of masculinity-femininity.

Description. (1) Internal characteristics.—Part I contains 50 items
of aviation interest and flying information, and part II contains 60 items
of sports-and-hobbies participation. Six mechanical-information items
borrowed from the Mechanical-Information test, CI905B, are included
in the general information score.

(2) Administration.—The examinee is told that if he completes part
I  before  time  is  up,  he  may  continue  with  part  II.  If  he  has  not  com¬ 
pleted  part  I  when  time  is  up,  he  is  to  go  on  to  part  II  at  that  time.  The 
time  is  20  minutes  for  each  part. 

(3) Scoring.—The score is simply the number of positively weighted
responses. For items where the correct response has negative validity
(music, current events, etc.), any mislead is considered the right answer.

* Developed at Psychological Research Unit No. 3. Chief contributors: Cpl. Albert A.
Canfield, Jr., Lt. David H. Jenkins, and Pvt. James A. Walker.

Statistical results.—This test was constructed and administered late
in the war (beginning September 1944), and so only distribution statistics
are available.

(1)  Distribution  statistics. — Using  a  sample  of  470  unclassified  avia¬ 
tion  students  tested  in  July  1944  at  Psychological  Research  Unit  No.  3, 
a  mean  of  69.9  and  a  standard  deviation  of  10.6  were  obtained. 

General  Information,  CE505FX2,  GX2  * 

These forms consist of 189 items of flying information. They were
administered to unclassified aviation students and will form a pool of
pre-validated items for use in future revisions of the flying-information
section of General Information, CE505F. The items in Part I of each
form have only one correct answer. The items of Part II may have
several correct answers. Two sample items are presented:

From Part I:

Coordination exercises require:

A. Spirals.

B. Power-off stalls.

C. Pylon eights.

D. Slips.

E. Skids.

From Part II:

An autogiro will:

A. Bank.

B. Roll.

C. Yaw.

D. Pitch.

E. Spin.

Statistical results.—The GX2 form was administered to 3,000 unclassified
aviation students. Statistical results were not available at the time
of  this  writing.  The  following  results  are  available  for  the  FX2  form, 
for  examinees  tested  in  April  and  May  1944  at  Psychological  Research 
Unit  No.  3. 

(1) Reliability coefficient.—By the alternate-forms method, an estimated
reliability coefficient of 0.84, corrected for length, was obtained.
This  figure  is  based  on  a  sample  of  927  unclassified  aviation  students. 

(2) Test validity.—A sample of 928 pilots yielded a biserial correlation
of 0.37, corrected for restriction of range, between performance in
this test and the graduation-elimination criterion in primary training.
The mean score for graduates was 63.86, for eliminees 58.26, and the
standard deviation for both combined was 12.05. Of this sample, 79
percent were graduates, and the standard deviation assumed for the
unrestricted pilot stanine distribution was 2.00.
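
The uncorrected coefficient can be approximated from the figures just given. The sketch below uses the classical biserial formula, r = ((Mg − Me) / SD) × (pq / y), where y is the ordinate of the unit normal curve at the point dividing the proportions p and q. That this was the computing formula is an assumption, as is the reading of the combined standard deviation as 12.05. The function name is illustrative.

```python
from statistics import NormalDist


def biserial_r(mean_graduates, mean_eliminees, sd_combined, p_graduates):
    """Classical biserial correlation between a test score and a pass-fail
    criterion assumed to dichotomize a normally distributed variable."""
    q = 1.0 - p_graduates
    unit_normal = NormalDist()
    z_cut = unit_normal.inv_cdf(p_graduates)  # point dividing p from q
    ordinate = unit_normal.pdf(z_cut)         # height of the normal curve there
    return (mean_graduates - mean_eliminees) / sd_combined * (p_graduates * q) / ordinate


# Figures reported for the FX2 form (uncorrected for restriction of range).
r_fx2 = biserial_r(63.86, 58.26, 12.05, 0.79)

# Figures reported later in the chapter for the FX3 form.
r_fx3 = biserial_r(40.51, 37.22, 9.06, 0.68)
```

Under these assumptions the uncorrected values come to roughly 0.27 and 0.22, each somewhat below the corresponding coefficient reported after correction for restriction of range, as would be expected.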

•  •'  r.rtlclctk.l  Ror.r<*  U.i«  Sw  1.  cw  t— U  Ck~W.  It.  Drr- 

tfckk  uvj  U.  WtUua  SI.  Wfc**h». 
General Information, CE505FX3, GX1 10


These tests consist of 126 aviation-interest items. The questions
concern airplanes, tactics, airplane identification, etc. They were
administered to unclassified aviation students to provide a backlog of validated
items  to  be  used  in  future  revisions  of  the  aviation-interest  section  of 
General  Information,  CE505F. 

Statistical  results. — The  GX1  form  was  administered  to  3,000  unclas¬ 
sified  aviation  students.  Statistical  results  were  not  available  at  the  time 
this  volume  was  being  written.  The  following  results  were  available  for 
the  FX3  form,  for  pilots  in  class  44J,  originally  tested  at  Psychological 
Research Unit No. 3.

(1)  Reliability  coefficient. — By  the  odd-even  method,  an  estimated  re¬ 
liability  coefficient  of  0.81,  corrected  for  length,  was  obtained.  This  fig¬ 
ure  is  based  on  a  sample  of  939  classified  pilots.  The  score  used  was 
rights  only. 

(2) Test validity.—A sample of 927 pilots yielded a biserial correlation
of 0.29, corrected for restriction of range, against the graduation-elimination
criterion in primary training. The mean score for graduates
was 40.51, for eliminees 37.22, and the standard deviation for both
combined was 9.06. Of this sample, 68 percent were graduates, and the
standard deviation assumed for the unrestricted pilot stanine distribution
was 2.00.

General Information, CE505GX8 11

This  test  consists  of  65  items  designed  to  assess  knowledge  of  avia¬ 
tion  slang.  The  slang  terms  are  those  that  would  be  used  on  the  flight 
line or in publications on flying. The test was administered to 3,000
unclassified aviation students and will be used as a backlog of prevalidated
items for future revisions of the aviation-interest section of General
Information, CE505F. Two typical items are:

"Umbrella  men”  are: 

A. Glider pilots.

B. Autogiro pilots.

C. Transport pilots.

D. Paratroopers.

E. Men who have bailed out.

A  "mickey”  is: 

A. A supercharger.

B. An aerial radar unit.

C. A Sperry turn-bank.

D. A pressurized cabin.

E. A droppable gas tank.

"  Constructed  at  Pjychologic.it  Research  Unit  No.  3.  Chief  contributors:  CpI.  Albert  A.  Can- 
field.  Jr..  T/Sgt.  Sanford  J,  Mock. 

"  Developed  at  Psychological  Research  Unit  No.  3.  Chief  contributors:  Cpt.  Letand  D. 
Orukaw  and  Cpt  Robert  E.  Lambert. 
Evaluation of General Information Tests

Validities for the tests in this section indicate that the amount of
knowledge and information about flying acquired prior to air-crew training
is a valid indicator of the background and interest conducive to success
in flying training. Analysis showed general-information tests to be
factorially complex. In addition to variance in the mechanical-experience
and verbal factors, they contained a pilot-interest factor. This was
particularly true of the pilot score, which was based on questions of (1)
flying information, (2) aviation information, (3) sports and hobbies, and
(4) driving information. As a measure of motivation for flying training,
the pilot-interest component of the test proved to be at least as valid as
direct expressions of strength of interest in flying, and perhaps slightly
more so (see ch. 26). Assuming the validity of the pilot-interest
factor to be 0.25 (see table 28.17), and the loading of this factor in the
test to be 0.50, the factor's contribution to the test validity would be the
product of these two values, or 0.125. The validity of self-ratings of
pilot interest is generally less than that.
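
The product rule used in the preceding paragraph extends to several factors: if the factors are uncorrelated, the validity to be expected of a test is the sum, over factors, of the test's loading times the factor's validity. The sketch below illustrates this under that assumption. Only the pilot-interest figures (0.50 and 0.25) come from the text; the function name is illustrative.

```python
def expected_validity(loadings, factor_validities):
    """Sum of loading-times-validity products over the factors in a test,
    assuming the factors are uncorrelated with one another."""
    return sum(loadings[name] * factor_validities[name] for name in loadings)


# Contribution of the pilot-interest factor alone, using the values assumed in the text.
pilot_interest_contribution = expected_validity(
    {"pilot interest": 0.50},
    {"pilot interest": 0.25},
)
```

Adding further factors to the two dictionaries, with their loadings and validities, gives the kind of expected validity that is elsewhere compared with the obtained validity of a test.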

The  navigator  score  was  less  useful,  because  it  seemed  to  be  duplicat¬ 
ing  information  obtained  from  mathematics  and  verbal-test  scores. 

The  bombardier  score  proved  to  be  too  unreliable  to  be  useful  for 
predictive  purposes  and  seemed  to  contribute  nothing  unique. 

BIBLIOGRAPHY 

(1) Flanagan, J. C. A Table of the Values of the Product-Moment Coefficient of
Correlation in a Normal Bivariate Population Corresponding to Given
Proportions of Successes, New York, Privately printed, 1936.
CHAPTER FIFTEEN


Perceptual  Tests1 


PERCEPTUAL  REQUIREMENTS  OF  AIR-CREW  DUTIES 

Even  a  superficial  acquaintance  with  the  duties  of  the  bombardier, 
navigator,  and  pilot  tempts  one  to  speculate  that  perceiving — the  appre¬ 
hension  of  objects  and  events  as  present  and  going  on  now  (1) — is  an  ac¬ 
tivity  basic  to  successful  air-crew  performance.  The  specification  of  the 
important  and  statistically  independent  perceptual  abilities,  however, 
which  would  have  made  possible  an  economical  and  searching  investiga¬ 
tion  of  the  area,  was  lacking.  In  the  absence  of  such  systematic  knowl¬ 
edge,  the  direction  of  perceptual  research  was  determined  in  large  part 
by  both  formal  and  informal  job  analyses.  Of  these,  perhaps  the  most 
important  analysis,  historically  speaking,  was  that  of  the  faculty-board 
proceedings  in  the  elimination  of  1,000  aviation  students  from  further 
elementary  pilot  training.  This  analysis  was  the  source  of  the  classifica¬ 
tion  of  perceptual  tests,  and  thus  provided  the  basic  framework  for  test 
research  and  construction  in  this  area. 

The  reasons  for  elimination  stated  in  the  proceedings  were  placed, 
upon  analysis,  into  four  categories:  Coordination  and  technique,  intel¬ 
ligence  and  judgment,  personality  and  temperament,  and  alertness  and 
observation.  The  latter  category  was  taken  to  coincide  with  the  area  of 
perception.  Statements  subsumed  under  this  category  were  found  in  70 
percent  of  the  eliminations. 

The break-down of this gross category of alertness and observation
provided the coding system for perceptual test construction. In the list
that follows, the first six coded categories and definitions are taken from
the published analysis of the faculty-board proceedings; the last two
categories were added later. Each perceptual test-code symbol begins
with the letters CP followed by a three-place number in one of the
following groups.

100.  Visualization  of  flight  course. — Ability  to  "get  out  of  the  cock¬ 
pit"  and  fly  the  plane  with  reference  to  the  horizon  and  reference  points, 
as  shown  by  the  ability  to  handle  ground  pattern  work,  maintain  constant 
altitude,  control  the  direction  of  the  plane,  make  turns  of  the  desired 
amount,  etc. 

1 Written by Capt. John I. Lacey.
200. Estimation of speed and distance.—Ability to make such estimates
of speed, distance, and altitude as are required in flying a course,
flying in formation, gliding, landing, etc.

300.  Sense  of  sustentation. — Ability  to  sense  support  or  lack  of  sup¬ 
port  of  the  airplane,  and  thus  detect  slips,  skids,  or  the  approach  of  a 
stall. 

400.  Division  of  attention. — Ability  of  the  pilot  to  remain  alert  and 
observant of things around him while flying and at the same time attend
to  all  the  necessary  details  and  carry  on  all  the  different  activities  neces¬ 
sary  for  precise  flying. 

500.  Orientation. — Ability  to  find  one’s  correct  geographic  position 
by  the  use  of  any  available  means,  such  as  familiar  reference  points  that 
are visible on the ground, identification of the area below as it is
represented on charts or maps, etc.

600.  Speed  of  decision  and  reaction. — Ability  to  think  quickly,  to  make 
rapid  decisions,  or  to  respond  with  speed  and  precision  when  the  situa¬ 
tion  demands. 

700.  Auditory  discrimination. 

800.  Form  perception. 
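
The coding scheme lends itself to a simple lookup. The sketch below assumes that the hundreds digit of the three-place number identifies the group and that a form letter may follow the number, as in the codes of other tests in this volume. The code CP610A in the example is used only as a hypothetical illustration, and the function name is likewise an assumption.

```python
import re

# Groups of the perceptual (CP) test code, keyed by hundreds digit.
PERCEPTUAL_GROUPS = {
    1: "Visualization of flight course",
    2: "Estimation of speed and distance",
    3: "Sense of sustentation",
    4: "Division of attention",
    5: "Orientation",
    6: "Speed of decision and reaction",
    7: "Auditory discrimination",
    8: "Form perception",
}


def perceptual_category(code: str) -> str:
    """Return the coded category for a perceptual test code such as 'CP610A'."""
    match = re.fullmatch(r"CP(\d{3})([A-Z0-9]*)", code.strip().upper())
    if match is None:
        raise ValueError(f"not a perceptual test code: {code!r}")
    hundreds = int(match.group(1)) // 100
    if hundreds not in PERCEPTUAL_GROUPS:
        raise ValueError(f"no perceptual group for code: {code!r}")
    return PERCEPTUAL_GROUPS[hundreds]


category = perceptual_category("CP610A")  # hypothetical code in the 600 group
```

A code outside the eight groups, or one not beginning with CP, raises an error rather than being assigned to a category.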

Data on the relative importance of these categories for bombardiers,
navigators,  and  pilots  may  be  found  in  chapter  1,  where  it  may  be  seen 
that  perceptual  abilities,  in  general,  stand  rather  high  on  the  lists  of 
required  abilities,  both  in  training  and  combat. 

The  critical  reader  may  well  question  why  this  list,  the  result  of  one 
early  job  analysis,  was  accorded  such  a  prominent  place  in  the  program 
of  test  construction.  It  is  derived  from  a  series  of  comments  made  by 
psychologically  naive  board  members,  which  were  later  ordered  into  a 
list  with  some  psychological  meaning.  Its  defects  as  a  contribution  to  a 
systematic  psychology  of  perception  are  obvious.  The  list  did  have,  how¬ 
ever,  two  great  advantages.  First,  it  provided  a  clue  to  important  per¬ 
ceptual  activities,  tests  of  which  had  some  promise  of  validity;  and 
second,  it  was  broad  and  permissive,  providing  a  flexible  if  not  rigorous 
scheme  of  classification.  In  a  time  when  test  construction  could  not  await 
the  detailed  job-analysis  findings  of  psychologically  informed  investiga¬ 
tors,  these  advantages  were  sufficient  to  justify  the  use  of  these  early 
findings  in  outlining  a  program  of  test  construction. 

AN  OVER-ALL  VIEW  OF  PERCEPTUAL  TEST  CONSTRUCTION 

Without  anticipating  the  detailed  discussion  to  follow,  subarea  by 
subarea  and  test  by  test,  it  seems  desirable  to  provide  an  over-all  view 
of  test-construction  activity  in  the  field  of  perception. 

It should be noted, first of all, that no tests were constructed in the
important areas of sensory psychology. Visual and auditory capacities
were the concern of the medical officer, and while civilian research
psychologists made important contributions to the field, military
psychologists devoted their attention to nonsensory problems.

The  coded  categories  set  forth  in  the  preceding  paragraphs  served  as 
the  basic  framework,  as  has  been  stated,  for  perceptual  test  construc¬ 
tion.  In  the  chapters  to  follow,  however,  an  uneven  distribution  of  effort 
over  these  categories  is  apparent.  Systematic  considerations  and  relative 
promise of high validity account for some of this inequality of effort,
of  course,  but  it  is  due  mainly  to  the  fact  that  test-construction  activity, 
especially  in  the  later  phases  of  the  program,  was  directed  primarily  by 
validation  returns  and  by  factor-analysis  results. 

As an example of uneven distribution of effort, it was discovered very
early  that  a  simple  speed  test  of  the  ability  to  match  airplane  silhouettes 
was  moderately  valid  for  elementary  pilot  training.  This  test,  Speed  of 
Identification,  quite  obviously  resembled  the  Identical  Forms  test  of  L.  L. 
Thurstone,  a  test  which  was  known  to  be  heavily  saturated  with  a  per¬ 
ceptual-speed  factor.  It  was  assumed,  and  quickly  proved  in  later  factor 
analyses,  that  the  Speed  of  Identification  test  was  indeed  a  quite  pure 
measure  of  the  perceptual-speed  factor,  and  almost  no  further  work  was 
done  on  the  test  until  the  very  end  of  the  program  when  a  new  test, 
without  face  validity,  was  constructed  for  comparative  purposes. 

On  the  other  hand,  tests  sampling  spatial  abilities  were  a  focus  of 
interest  during  the  entire  period  of  activity  in  test  construction  and  re¬ 
search.  They  were  known  to  be  valid,  but  no  very  pure  measures  ex¬ 
isted.  It  eventually  became  clear  that  at  least  two  spatial  factors  were 
involved.  Efforts  to  define  these  reference  abilities  more  sharply  and 
to  determine  their  validities  were  still  being  made  at  the  end  of  the  war. 

The organization of the chapters in this section of the volume does
not  completely  follow  the  coding  system.  Visualization  tests,  for  example, 
do  not  appear  at  all,  since  it  seemed  that  they  suited  an  intellectual  clas¬ 
sification  better.  The  chapter  headings  represent  a  compromise  between 
the  desire  for  classification  according  to  primary  reference  abilities  and 
the  present  need  for  a  priori  classifications  in  areas  yet  unexplored  with 
the  tool  of  factor  analysis. 

RESPONSIBILITY  FOR  PERCEPTUAL  TEST  DEVELOPMENT 

In the original division of research responsibility in the Aviation
Psychology Program, a perceptual research unit was activated in April
1942, at the headquarters of the AAF Training Command. This unit
constructed both motion picture and printed aptitude tests for air crew.
In October 1943, the responsibility for these two media was divided. A
psychological test film unit was activated to continue research with
motion pictures, and the responsibility for printed tests of perceptual
abilities was transferred to Psychological Research Unit No. 3. In November
1944, with the deactivation of Psychological Research Unit No. 3 and
the transfer of its personnel to Psychological Research Unit No. 2, the
responsibility for all printed air-crew aptitude tests was assigned to the
latter unit. Research on perceptual tests was also carried on by those
concerned with the construction of the AAF Qualifying Examination.*

BIBLIOGRAPHY 

(1) Bentley, M., The New Field of Psychology, New York, D. Appleton-Century
Co., 1934.

* For a report of this activity, see Report No. 6 of this series. There will be no descriptive
treatment in this volume of the perceptual tests constructed for the Qualifying Examination.

CHAPTER SIXTEEN


Perceptual  Speed  Tests1 


RATIONALE  FOR  PERCEPTUAL  SPEED  TESTS 
General Statement

The  study  of  perception  is  traditional  in  the  history  of  experimental 
psychology.  In  fact,  probably  no  phase  of  human  behavior  has  been  so 
extensively  examined  in  experimental  laboratories.  While  the  known 
facts  concerning  the  perception  of  normal  human  individuals  are  nu¬ 
merous,  the  analysis  of  perceptual  activities  into  fundamental  abilities 
and  the  measurement  of  individual  differences  in  those  abilities  has  not 
kept  pace  with  laboratory  studies.  Only  in  recent  years  has  the  factor- 
analysis  approach  been  brought  to  bear  upon  the  description  of  separate 
and  distinct  abilities  in  this  area. 

It  does  not  lie  within  the  scope  of  this  volume  to  present  a  review  of 
the  work  done  before  the  war.  It  is  desirable,  however,  owing  to  the 
fact  that  the  treatment  of  perceptual  tests  in  this  volume  is  tied  up  with 
factorial  considerations,  to  refer  briefly  to  certain  previous  factorial  in¬ 
vestigations. 

Early  Factor  Studies 

Perhaps the most objective attempt to isolate and to classify the
variables involved in perception can be found in Thurstone's work. From
a matrix of intercorrelations of 56 psychological tests, he extracted and
named seven "primary" factors, one of which was labeled perceptual
(factor P). A study of the tests saturated with factor P "indicated that
the perceptual factor might consist in a facility to perceive detail even
when it is buried among perceptual distractors. . . . The characteristic
that seemed to be common . . . was the readiness to discover and to
identify perceptual details" (1). This hypothesis was supported by the
results of a subsequent analysis, using 22 tests taken from the original
56, plus 9 new tests prepared especially to help define the perceptual
factor (2).

In another study (3), Thurstone analyzed the relationship of 43
individually-administered laboratory tests, each of which was designed to
measure some aspect of perception. Several new factors were identified
as perceptual in nature. These included: speed and strength of closure,
the ability to hold figures in mind without losing their identity or
shape; geometric illusion, the ability to resist the effect of certain
geometric illusions; reversal in perception, the tendency to see rapid
alternating effects; and freedom from Gestaltbindung, flexibility in
manipulating several more or less irrelevant or conflicting Gestalts.

1 Written by S/Sgt. Wayne S. Zimmerman.

Perceptual  Speed  in  Aviation  Psychology 

Analysis of the reports of pilot instructors revealed that in 14 percent
of all student failures studied a lack of speed in making decisions
and in reacting was mentioned as a cause for elimination. The trait of
"speed of decision and reaction" was described as the ability to think
quickly, to make rapid decisions, and to respond with speed and precision
when the situation demands. Some typical comments by the instructors
regarding eliminated students were: "Slow to think and act in
the air," "suffers from indecision," "unable to make rapid decision,"
"choice of fields slow and unsatisfactory," and "slow reaction time."
No specific mention was made of speed of perceiving or apprehending
a path or a pattern, or of speed of distinguishing meaningful visual
detail, although the lack of such abilities may be partially responsible for
remarks such as those just cited.

In  the  report  of  student  failures  referred  to  in  the  preceding  para¬ 
graph,  lack  of  judgment  was  the  most  frequently  listed  cause  of  elimi¬ 
nation.  To  what  degree  evaluations  of  judgments  as  good  or  poor  depend 
upon  quickness  of  decision  is  not  known.  It  is  reasonable  to  assume, 
however,  that  slowness  in  perception  is  a  contributing  factor. 

Tests that were found to measure perceptual speed are described and
discussed in this chapter under the subheadings (1) Speed of Apprehending
Perceptual Detail, and (2) Clerical Speed.

TESTS  OF  SPEED  IN  APPREHENDING  PERCEPTUAL  DETAIL 

Tests described within this group have in common problems that appear
to demand the rapid visual perception of detail or the recognition
of similarities and differences. A comparison of form and design and
an identification of patterns or details that may be buried among
perceptual distractors are involved. The individual items are simple; a good
score depends almost entirely upon the rapidity with which the examinee
can perceive the details.

Speed  of  Identification,  CP610A  * 

This test was designed to measure speed and accuracy of form perception.
The speed with which identical airplane silhouettes can be identified
by quickly noting differences and similarities of form was believed
to be indicative of the prospective air-crew member's ability to
perform rapidly the task of identifying enemy aircraft, taking instrument
readings, noting airplane altitudes, recognizing landmarks, and
accomplishing other activities demanding speed and precision of perception.

* Developed at the Office of The Air Surgeon, Headquarters, Army Air Forces. Chief
contributor: Lt. Col. Paul M. Fitts, and Staff.

Description. — The test sheet is divided into 12 panels. On the left in
each panel are silhouettes of four airplanes. On the right are silhouettes
of five airplanes, four of which are identical with the four on the left
and one of which is different. The planes at the right are rotated, in a
haphazard manner, into different positions than those at the left. For
each plane on the left, the examinee must find the matching one on the
right. The test is printed directly on an IBM answer sheet. Under each
plane in the left of a panel are answer spaces marked A, B, C, D, and E.
The five planes on the right are labeled A, B, C, D, and E to correspond.
Thus, if plane B on the right is identical with the top plane on the left,
that top plane should have space B blackened below it.

(1) Internal characteristics. — A practice-test sheet is provided,
containing one panel of four recorded, but unscored, practice items. The
test contains 12 panels, with a total of 48 recorded and scored items
printed on 2 sides of an IBM answer sheet.

(2) Administration. — After the directions are read and the signal to
begin the test is given, the perforated directions sheet is torn away from
the test sheet by the examinee. The administration time is 5 minutes,
while 4 minutes are allowed for completing the test items, making a total
testing time of 9 minutes. Practice items are shown in figure 16.1.

(3) Scoring. — From 1 December 1942 to 10 July 1943, this test was
scored (R - W)/2. Both before and after that period the scoring
formula was R - W.
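
The two formulas differ only by a constant divisor, so scores obtained under (R - W)/2 run at half the level of scores obtained under R - W. The following is a minimal sketch in Python, offered only as an illustration. The function name, its parameters, and the examinee's counts are hypothetical and are not taken from the original scoring procedure.

```python
def formula_score(rights, wrongs, penalty=1.0, divisor=1.0):
    """Return (R - penalty * W) / divisor; omitted items do not enter the score."""
    return (rights - penalty * wrongs) / divisor


# Hypothetical examinee on the 48-item test: 40 right, 6 wrong, 2 omitted.
score_r_minus_w = formula_score(40, 6)             # R - W
score_halved = formula_score(40, 6, divisor=2.0)   # (R - W)/2

print(score_r_minus_w, score_halved)
```

Because the divisor rescales every score alike, it changes means and standard deviations but leaves the rank order of examinees, and hence correlations with other variables, unchanged.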

Statistical  results. — Having  been  a  classification-battery  test,  Speed  of 
Identification  was  extensively  analyzed. 

(1) Distribution statistics. — Typical examples of distribution
statistics are given in table 16.1. The distribution curves are negatively skewed
and considerably flatter than normal.


Table 16.1. — Distribution constants for Speed of Identification, CP610A

Group                             Psychological        N          M      SD    Scoring
                                  research unit No.                            formula

Unclassified aviation students    1, 2, 3              3,000      33.3   7.3   R - W
    Do                            2                    1,520      25.6   6.9   R - W
    Do                            1                    2,729      31.3   7.5   R - W
    [group not legible]           1                    1,090      14.9   3.8   (R - W)/2
Classified pilots 1               1, 2, 3              [illeg.]   31.6   7.6   R - W
Classified navigators             3                    [illeg.]   32.6   7.5   R - W
Radio operators 2                 1, 2, 3              11 [?]     14.7   3.9   (R - W)/2
    [group not legible]           [illeg.]             120        14.4   3.8   (R - W)/2
West Point cadets 3               ...                  [illeg.]   38.7   7.2   [illeg.]

1 Classes 43J and 43K.

2 Previously eliminated from pilot training.

3 Class of 1944.


(2) Internal consistency. — The degree of homogeneity of the items is
indicated by a mean internal-consistency phi of 0.09, a standard deviation
of the phi distribution of 0.09, and a range of values from -0.09
to 0.36. The phi values are low, because the test is highly speeded. These
statistics are based upon analysis of the responses of the highest 27
percent and the lowest 27 percent in total score of a group of 700
unclassified aviation students tested in April 1942 at Psychological
Research Unit No. 3.
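
The source does not state how the internal-consistency phi was computed. One common reading, assumed here, is a fourfold-table phi for each item, contrasting the highest and lowest 27 percent of examinees on total score (189 in each group for a sample of 700) by whether the item was passed or failed. The item counts in this Python sketch are invented for illustration, and the function name is hypothetical.

```python
import math


def phi_coefficient(upper_pass, upper_fail, lower_pass, lower_fail):
    """Phi for a 2 x 2 table: criterion group (upper/lower) by item (pass/fail)."""
    a, b, c, d = upper_pass, upper_fail, lower_pass, lower_fail
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        return 0.0  # a margin is empty, so the item cannot discriminate
    return (a * d - b * c) / denominator


group_size = round(0.27 * 700)  # 189 examinees in each extreme group

# Invented counts for one easy item: 180 of the upper group and
# 160 of the lower group pass it.
item_phi = phi_coefficient(180, group_size - 180, 160, group_size - 160)
print(round(item_phi, 3))
```

When nearly every examinee who reaches an item passes it, as in a highly speeded test of easy items, the pass margin is lopsided and phi cannot rise far above zero, which agrees with the low values reported.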

(3)  Reliability  coefficients. — Several  samples  yielded  the  estimates  of 
reliability  given  in  table  16.2. 


Table 16.2. — Reliability coefficients for Speed of Identification, CP610A, based
upon groups of unclassified aviation students

[Table body not recoverable from the source. Coefficients still legible, in
order of appearance: 0.61, 0.76, .49, .57; two further entries read "63" and
"23". Two columns of reliability coefficients were given; their headings are
not legible.]

Footnotes, so far as legible:

Tested at Medical and Psychological Examining Unit No. 8. Date of testing not reported.

Tested at Psychological Research Unit No. 1 in August 1942.

First and second administrations, 4 minutes.

Extremely low correlation due to many cadets finishing all items in retest.

(4) Difficulty. — Based upon the responses of the above-mentioned
sample of 700 unclassified aviation students, the test yielded a mean
proportion of correct responses of 0.94, corrected for chance, with a range
from 0.10 to 0.99, and a standard deviation of 0.08.

(5) Factorial composition. — The most significant loading is in the
perceptual-speed (0.74) factor in which the test is almost pure. The
communality is 0.63. For a full picture of the factorial composition of
this test, see appendix B.

(6) Test validity. — Validation results based on several samples are
given in tables 16.3 to 16.7 inclusive.


Table 16.5. — Validity data for Speed of Identification, CP610A, using seven grades
in navigation training as criteria, for a sample of 463 navigation trainees in Hondo
classes 43-10 through 43-15

Grade                                    r 1            r corrected 2

Dead reckoning (ground school)           0.06           0.11
Celestial navigation (ground school)     [.06 or .08]    .12
Dead reckoning (flight)                   .11            .14
Celestial navigation (flight)             .08            .12
Meteorology                              0.09           0.13
Military                                 -.08           -.06
Final composite                           .09            .15

1 Product-moment correlations.

2 Assumed unrestricted stanine standard deviation not reported.


Variations. — The B form of Speed of Identification was to have been
a lantern-slide adaptation using the items of the original test. It was
suggested that tachistoscopic exposures would result in a purer measure of
perceptual speed than the temporally uncontrolled printed administration.
This slide adaptation was never fully developed, so data to verify
or disprove the hypothesis are not available.

The C form was prepared for a two-fold purpose:* (1) To remove
the possible influence of aviation interest by constructing items utilizing
meaningless symbols rather than airplanes, and (2) to reduce the item
difficulty as close to zero as possible in order to provide a pure speed test.

For  the  purposes  of  factor  analysis,  a  form  of  Speed  of  Identification 
was  constructed  with  the  response  alternatives  unrotated  from  their  origi¬ 
nal  position.4  This  version  is  without  a  code  designation. 

Evaluation. — Speed of Identification, CP610A, was used in the
classification battery from March 1942 to November 1943. It was dropped at
that time, because it became known that Spatial Orientation I measured
perceptual speed nearly as well and, furthermore, was more valid for
pilots, due to additional factor content. Speed of Identification was
reinstated in September 1944 because of a recognition of the fact that it was
the strongest and purest measure of the perceptual-speed factor thus far
constructed. Although its average primary pilot validity of 0.18 is lower
than that of either Spatial Orientation I or II, its higher loading on the
perceptual-speed factor indicates that practically all of its validity is due
to saturation with this factor. Loadings with other factors (measured
better by other tests) account for the slightly higher validities of Spatial
Orientation I and II.

* Developed at Psychological Research Unit No. 3. Chief contributor: Pfc. Sidney W. Pinkd.

4 Constructed at the Psychological Section, Headquarters, AAF Training Command, Fort
Worth, Tex.

In an analysis which included both the rotated and nonrotated forms
of Speed of Identification, no significant differences in factor structure
between the two forms were revealed (see table 28.15). It might be
expected that rotation would increase visualization content, and there is
some slight indication that this is so; but apparently the perceptual
differences in design of the airplanes are sufficiently gross that the examinee
finds it unnecessary to rotate mentally an image of the object in order
to make comparisons. Thus, only speed of perception is involved to a
significant extent.

Pursuit  Test,  CP414A  *  (Path  Tracing,  CP512A) 

It was thought that some elements of foresight and planning might be
involved in tests like the McQuarrie Path Tracing Test. A modification
of this test was therefore prepared for group-test administration to
air-crew candidates.

Description. — Items are arranged in blocks of 10. Down the left-hand
side of each block are 10 numbers, and down the right-hand side of the
block are 10 letters. Each number is connected by an irregularly curved
line to a letter on the opposite side of the block. Thus a maze of lines is
formed. The examinee's task is to trace visually each line from its
beginning to its termination and to mark the appropriate letter opposite
the item number on the separate answer sheets.


FIGURE  16.2 

SAMPLE  PROBLEMS  OF  PURSUIT  TEST, 
CP414A


* Developed at Psychological Research Unit No. 3.

(1) Internal characteristics. — The Pursuit Test is divided into two
parts, separately assembled. It is mimeographed. The Path Tracing test
is the printed version of Part I of the Pursuit Test. The directions
contain 10 recorded, but unscored, practice items. Each part of the test
contains 80 scored items, in 8 blocks of 10 items each.

(2) Administration. — Answers are marked directly on two 15-place
answer sheets. Five minutes are allowed to complete 80 items. The sample
items are shown in figure 16.2.

Following  is  part  of  the  directions : 

In the example, you will find 10 lines which run across the diagram from left to
right. The beginning of each line is numbered. Notice that line 1 begins in the upper
left corner and ends at the letter H. Opposite item 1 on your answer sheet blacken
the space under H.

Similarly,  notice  that  line  2  ends  at  letter  J  and  that  line  3  ends  at  letter  D.  For 
items  2  and  3  on  your  answer  sheet,  blacken  the  space  under  J  and  D  respectively. 

(3) Scoring. — The scoring formula is R - W.

Statistical results. — Data are available for this test on small samples.
With the exception of the reliability coefficient, all data are for Path
Tracing, or for Part I of the Pursuit Test.

(1)  Distribution  statistics — Distribution  constants  are  presented  in 
table  16.8. 


Table 16.8. — Distribution constants * for Pursuit Test, CP414A, and Path Tracing,
CP512A

Form      Group                               N      M      SD

CP414A    [group not legible] 1               199    46.5   [illeg.]
CP512A    Unclassified aviation students 2    460    47.3   7.[?]

* Scored rights only.

1 In class 44A. Tested at Psychological Research Unit No. 3.

2 Tested at Psychological Research Unit No. 3 in March 1945.


(2) Reliability coefficients. — By the alternate-forms (part I-part II)
method, an estimated reliability coefficient of 0.80, corrected for length,
was obtained. This figure is based on a sample of 210 unclassified aviation
students tested in April 1942 at Psychological Research Unit No. 3.
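
The source does not name the correction for length. Assuming it is the usual Spearman-Brown step-up for a test doubled in length, the uncorrected part I-part II correlation of 0.66 quoted under factorial composition reproduces the corrected figure of 0.80. The function name in this Python sketch is hypothetical.

```python
def spearman_brown(r_part, length_factor=2.0):
    """Estimated reliability of a test lengthened by length_factor,
    given the correlation r_part between comparable parts."""
    return length_factor * r_part / (1.0 + (length_factor - 1.0) * r_part)


# Uncorrected correlation between Part I and Part II of the Pursuit Test.
corrected = spearman_brown(0.66)
print(round(corrected, 2))
```

The same step-up is presumably what "corrected for length" means for the odd-even coefficients reported for the other tests in this chapter, although there too the source leaves the formula unstated.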

(3) Factorial composition. — For Path Tracing, CP512A, the most
prominent loadings are in the perceptual-speed (0.51), planning (0.27),
numerical (0.25) and spatial (0.17) factors. The communality is 0.50,
to be compared with the uncorrected reliability of 0.66 for the Pursuit
test. For a full picture of the factorial composition of this test, see
appendix B.
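
The comparison of communality with reliability can be made concrete. Assuming orthogonal factors (an assumption; the full factor solution is given only in appendix B), the communality is the sum of the squared loadings on the common factors, and the difference between reliability and communality is reliable variance specific to the test. The variable names in this Python sketch are illustrative only.

```python
# Loadings quoted in the text for Path Tracing, CP512A.
loadings = {
    "perceptual speed": 0.51,
    "planning": 0.27,
    "numerical": 0.25,
    "spatial": 0.17,
}
communality = 0.50               # as reported
uncorrected_reliability = 0.66   # as reported for the Pursuit test

# Share of variance carried by the four loadings listed in the text.
listed_share = sum(a * a for a in loadings.values())
# Remainder of the communality, attributable to loadings not listed here.
unlisted_share = communality - listed_share
# Reliable variance not shared with any common factor.
specific_share = uncorrected_reliability - communality

print(round(listed_share, 4), round(unlisted_share, 4), round(specific_share, 2))
```

On these figures the perceptual-speed loading alone accounts for about 0.26 of the variance, a little more than half of the communality.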

(4)  Test  validity.— Validation  results  are  given  in  table  16.9. 


Table 16.9. — Validity data for Pursuit Test, CP414A, and Path Tracing, CP512A,
based upon graduation-elimination of pilots in primary training

Class            Psychological    p [?]    N      M         M           SD      r
                 research                         (grad-    (elimi-
                 unit No.                         uates)    nees)

44A              3                0.86     199    46.63     45.67       6.84    0.08
43J              3                 .82     171    41.12     41.52       6.86    -.02
44A, 44B, 44C    1                 .74     640    47.91     46.08       7.56     .14
43K              3                 .87     186    54.60     56.3[?]     7.28    -.13

[The Form column and the subscripts of the column headings are not legible in
the source; the heading of the third column is missing. A single corrected
coefficient, 0.17 *, is legible, but its row cannot be determined.]

* Assumed unrestricted stanine standard deviation not reported.

Evaluation. — The average pilot validity of Pursuit, CP414A, is only
0.09.  The  factor  content  of  Pursuit  is  puzzling.  In  Thurstone’s  analysis 
of  56  tests,  Pursuit  appeared  to  be  highly  saturated  with  factor  S 
(space).  In  a  re-analysis  by  aviation  psychologists  (not  covered  in  this 
report)  of  19  of  the  56  tests,  using  the  same  intercorrelations,  the  same 
factor  picture  was  derived  for  this  test.  Results  of  several  subsequent 
analyses  based  on  form  CP512A,  however,  showed  most  of  its  common 
variance  on  the  perceptual-speed  factor.  One  rationale  advanced  to  ex¬ 
plain  the  perceptual  content  is  that  speed  and  accuracy  of  response  is 
gained  by  rapidly  perceiving  the  details  of  a  path.  The  criss-crossing  of 
pathways  requires  considerable  close  examination  and  the  seeing  of  pat¬ 
terns  in  spite  of  entangling  distractors. 

Evaluation of Tests of Speed of Apprehending Detail

The  similarity  of  Speed  of  Identification  and  Thurstone’s  Identical 
Forms  test  is  substantial  evidence  for  believing  that  his  factor  P  and 
the  perceptual-speed  factor  found  in  AAF  tests  are  one  and  the  same. 
Why  the  Pursuit  test  should  appear  to  be  primarily  a  spatial  test  in  one 
analysis  and  a  perceptual-speed  test  in  another  analysis  is  not  answered. 
Evidence  that  it  belongs  with  perceptual  speed  rather  than  with  spatial 
tests  is  strongly  supported  by  three  separate  analyses,  each  based  on  data 
collected  from  independent  samples.  It  is  possible  that  this  type  of  test 
is very sensitive to minor changes of design; i. e., that alterations in
drafting  the  pathways  may  call  for  distinct  functional  shifts  in  the  task 
of  examinees. 

Average  pilot  and  navigator  validities  for  Speed  of  Identification  of 
0.18,  and  other  evidence,  suggest  that  a  conservative  estimate  for  the 
validity  of  the  perceptual-speed  factor  would  be  between  0.15  and  0.20 
for  both  air-crew  specialties.  These  findings  bear  out  the  original  pre¬ 
diction  that  measurements  of  perceptual  ability  would  be  a  valuable  aid 
to  predicting  air-crew  success. 

CLERICAL  SPEED  TESTS 

Tests described within this group are similar in that the tasks involved
are  clerical.  No  test  specifically  designed  to  measure  general  clerical 
ability  is  included,  but  the  individual  problems  are  similar  to  or  are 
related  to  clerical  tasks.  The  problems  include  reading  graphs,  dials, 
meters,  or  tables,  and  checking,  classifying,  or  filing  numbers.  Support 
for  including  tests  of  this  nature  may  be  found  in  the  average  ratings 
made  by  supervisors  of  combat  teams,  which  show  that  on  a  nine-point 
scale,  dial-and-table  reading  has  a  mean  rating  of  6.8  for  pilots  (see 
table 1.6). Ratings for combat navigators give dial-and-table reading a
mean  rating  of  6.6  (see  table  1.4). 

Graph  Reading,  CP601B 

This test and the next four tests described in this chapter are Parts
I through V of the Quantitative Perception Tests copyrighted in 1941
by the Cooperative Test Service. These tests were adopted and used for
a short time in the classification battery during the first months of
testing. When the tests were selected, it was too early in the program for
validation data to be available. They were appealing, because the test
items presented tasks that air-crew members were known to encounter
to some extent in their training and in later operations. They were
replaced by apparently better designed tests of similar content or function
before validation data were known.

Description. — The  answers  to  all  of  the  test  problems  are  read  from  a 
graph  on  which  are  drawn  two  curves.  X  and  Y  axes  are  labeled,  and 
values are indicated on abscissa and ordinate. Each problem gives either
the  X  or  Y  value  for  one  of  the  curves,  and  the  examinee  is  required 
to  determine  the  other  value.  Two  alternative  answers  from  which  the 
correct  one  is  selected  are  provided  for  each  item. 

(1) Internal characteristics. — The directions contain one graph from
which the answers to eight items are read. Four of the answers are
indicated correctly, and four are recorded by the examinee but are
unscored. The test contains 1 graph from which the answers to 52 recorded
and scored items are read.

(2) Administration. — Each examinee receives the special IBM form
on which are presented both directions and items. The directions are
printed on one half of one side of an IBM answer sheet. The items are
presented on the other half of the page and are arranged so that they
appear upside down while the directions are being read. Thus, the page
must be turned end for end when the signal to begin the test is given.
This format and procedure apply to the other tests in the series. Three
minutes are allowed to complete the 52 test items. Sample items are
shown in figure 16.3.


FIGURE 16.3

SAMPLE PROBLEMS OF GRAPH READING,
CP601B


Following is part of the directions:

Read the y-values corresponding to the values of x given below for curve I to the
nearest whole number; repeat for curve II. The spaces under the correct answers
have already been marked in the sample. (See fig. 16.3.)

The correct answers may be obtained in the following manner: (1) Find the point
on the x axis (the heavy horizontal line marked OX) where x is equal to 2; (2)
follow the heavy vertical line upwards until you find the point that it crosses curve
I; (3) from this point follow across horizontally to the y axis (the heavy vertical
line marked OY) and read the y value, which is 3 in this case. In the sample, the
answer space under 3 has therefore been marked.

(3) Scoring. — The first scoring formula was R - 3W. It was changed
to R - 2W, and finally to R - W.
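
The source does not give the reason for relaxing the penalty. The arithmetic below, a sketch in Python with hypothetical function and parameter names, shows what each formula implies for an examinee who marks all 52 two-choice items blindly. With two alternatives, R - W is the formula under which blind guessing has an expected score of zero; the heavier penalties drive the expected score of such an examinee well below zero.

```python
def expected_guess_score(n_items, n_alternatives, penalty):
    """Expected value of R - penalty * W when every item is answered at random."""
    p_right = 1.0 / n_alternatives
    return n_items * (p_right - penalty * (1.0 - p_right))


# Graph Reading: 52 items, two alternative answers per item.
for penalty in (3, 2, 1):
    print(f"R - {penalty}W:", expected_guess_score(52, 2, penalty))
```

More generally, for items with k alternatives the penalty that just cancels blind guessing is 1/(k - 1) per wrong answer.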

Statistical results. (1) Distribution statistics. — Typical examples of
distribution statistics are given in table 16.10.


Table 16.10. — Distribution constants for Graph Reading, CP601B 1

[Table body largely illegible in the source. Legible sample sizes (N): 1,134;
226; 365; 557. Group labels and means are not recoverable.]

1 On these samples scoring formulas could not be determined from available data.

2 Group not identified.

3 Tested in March and April 1942 at Psychological Research Unit No. 1.

4 Tested in the period Apr. 1 to Aug. 14, 1942, at Psychological Research Unit No. 3.

(2)  Reliability  coefficient. — By  the  odd-even  method,  an  estimated  re¬ 
liability  coefficient  of  0.70,  corrected  for  length,  was  obtained  using  the 
scoring  formula  R— 3W.  Owing  to  the  fact  that  the  test  is  highly 
speeded,  this  figure  is  spuriously  high.  The  estimate  is  based  on  a  sample 
of  226  unclassified  aviation  students  tested  at  Psychological  Research 
Unit  No.  1  in  March  1942. 

(3) Test validity. — Validation results based on several samples are
given  in  table  16.11. 

Evaluation. — Graph  Reading  proved  to  have  high  validity  for  naviga¬ 
tors  and  a  moderate  validity  for  pilots. 

The  reliability,  although  low,  is  acceptable  for  inclusion  in  a  battery. 
The  test  is  comparatively  easy  to  administer  and  to  score. 

Meter  Reading,  CP602B 

This  is  Part  II  of  the  Quantitative  Perception  tests. 

Description. — Each item consists of a diagram of a portion of a meter
with a needle indicating a reading. The examinee is required to read each
dial to the nearest whole number. In nearly all items some of the
divisions between numbers are removed and the examinee must estimate
needle positions. Two alternative answers are listed for each item.

(1)  Internal  characteristics. — The  directions  contain  five  items  with 
the  correct  answers  already  marked  and  five  recorded,  but  unscored, 
practice items. The test contains 50 recorded and scored items.


(2) Administration. — Each examinee receives a special IBM form on
which both directions and items are printed. Three minutes are allowed
for completion of the test items. Sample items are shown in figure 16.4.


FIGURE  16.4 

SAMPLE  PROBLEM  OF  METER  READING, 

CP602B 

(3) Scoring. — The first formula used was R - 3W. It was changed to
R - 2W, and finally to R - W.

Statistical results. (1) Distribution statistics. — Typical examples of
distribution statistics are given in table 16.12.


Table 16.12. — Distribution constants for Meter Reading, CP602B *

Group                    N      M

[group not legible]      234    22.2
[group not legible]      557    20.7
[group not legible]      365    24.6

* On these samples scoring formulas could not be determined from available data.

1 Tested in March and April 1942 at Psychological Research Unit No. 1.

2 Tested in the period April 1 to August 14, 1942, at Psychological Research Unit No. 3.


(2) Reliability coefficient. — By the odd-even method, an estimated
reliability coefficient of 0.73, corrected for length, was obtained using the
formula R - 3W. Since this test is speeded, the value is probably an
overestimate. The coefficient is based on a sample of 234 unclassified
aviation students tested in March 1942 at Psychological Research Unit
No. 1.

(3) Test validity. — Validation results are given in table 16.13.

Evaluation. — Meter Reading proved to have relatively high navigator
validity and low, but definite, pilot validity. Reliability is minimally
satisfactory. Administration and scoring are comparatively easy.


Table Reading, CP603B

This  is  Part  III  of  the  Quantitative  Perception  tests. 

Description. — Each  item  requires  the  examinee  to  check  whether  or 
not  the  listed  square  or  square  root  of  a  number  is  correct.  A  table  of 
squares  and  square  roots  is  provided  for  this  purpose. 

(1) Internal characteristics. — The directions contain 8 sample items
and a table of squares and square roots of the numbers from 25 to 50.
Four of the items are already answered correctly, and four are to be
answered by the examinee. The sample items are not scored. The test
contains a table of squares and square roots of the numbers from 51 to 100,
and 48 recorded and scored items.

(2) Administration. — Each examinee receives a special IBM form on
which both directions and items are presented. Three minutes are allowed
to complete the test. Sample items are shown in figure 16.5.


FIGURE 16.5

SAMPLE PROBLEM OF TABLE READING,
CP603B


Following is part of the directions:

The . . . table gives the squares and square roots of numbers from 25 to 50.
Look  up  in  this  table  the  squares  of  the  numbers  in  column  (1).  If  the  answer  given 
is  right,  mark  the  space  under  R;  if  it  is  wrong,  mark  the  space  under  W.  Then  look 
up  the  square  roots  of  the  numbers  in  column  (2)  and  blacken  the  answer  space 
under  R  if  the  answer  given  is  right  or  W  if  it  is  wrong. 

(3) Scoring. — When the test was first introduced into the battery, the
scoring formula was R - 3W. This was changed to R - 2W and finally
to R - W.
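The three formulas differ only in the weight attached to wrong answers. As a
rough illustration (the function name `formula_score` and the counts of right
and wrong answers are hypothetical, not taken from the report), the effect of
each penalty on one answer sheet can be sketched in Python:

```python
def formula_score(rights: int, wrongs: int, penalty: float = 1.0) -> float:
    """Rights minus a weighted count of wrongs; unreached items count nothing."""
    if rights < 0 or wrongs < 0:
        raise ValueError("counts must be non-negative")
    return rights - penalty * wrongs


# Hypothetical examinee: 30 items right, 4 wrong, the rest not reached.
for penalty in (3, 2, 1):
    print(f"R - {penalty}W = {formula_score(30, 4, penalty)}")
```

Items that are not reached are neither right nor wrong, so they leave the
score unchanged under any of the three formulas.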

Statistical results. (1) Distribution statistics. — Typical examples of
distribution statistics are given in table 16.14.


Table 16.14. — Distribution constants for Table Reading, CP603B*


Group

N

M

SD

557

26.1

4.* 

us 

us 

T.J 

* On these samples scoring formulas could not be determined from available data.

² Tested from Apr. 1 to Aug. 14, 1942 at Psychological Research Unit No. 3.


(2) Reliability coefficient. — By the odd-even method, an estimated
reliability coefficient of 0.85, corrected for length, was obtained using
the scoring formula R - 3W. Since the test is speeded, the coefficient is
spuriously high. This figure is based on a sample of 234 unclassified
aviation students tested in March 1942 at Psychological Research Unit
No. 1.
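The odd-even procedure can be sketched as follows. This is only an
illustration under stated assumptions: the report does not name the length
correction, so the customary Spearman-Brown formula is assumed here, and the
function names and the small matrix of item scores are invented for the
example.

```python
from statistics import mean


def pearson_r(x, y):
    """Product-moment correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_brown(r_half, factor=2.0):
    """Step a part-test correlation up to a test `factor` times as long."""
    return factor * r_half / (1 + (factor - 1) * r_half)


def odd_even_reliability(item_scores):
    """Correlate odd-item and even-item half scores, then correct for length."""
    odd = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd, even))


# Hypothetical data: five examinees, six items scored 1 (right) or 0 (wrong).
DEMO = [
    [1, 1, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 1, 0],
]
print(round(odd_even_reliability(DEMO), 2))
```

On a speeded test the items near the end are left unanswered by slow
examinees in both halves alike, which inflates the correlation between the
halves; this is why the text treats such coefficients as too high.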

(3) Test validity. — Validation results based on several samples are
given in table 16.15.

Evaluation. — On relatively small samples, Table Reading proved to
have moderate to relatively high validity for predicting success in
navigator training and, on a very large sample, zero validity for pilots.
Test CP603B probably has a satisfactory reliability and is easy to
administer and to score.

Number  Reading  (Filing),  CP604B 

This  is  Part  IV  of  the  Quantitative  Perception  tests. 

Description. — Each  item  presents  a  number  to  be  filed  and  two  alter¬ 
native  numbers  after  which  it  might  be  filed.  One  master  table  listing  all 
of  the  numbers  in  the  file  accompanies  the  items.  The  examinee  is  re¬ 
quired  to  determine  which  of  the  two  alternative  numbers  the  number  to 
be  filed  should  follow. 

(1)  Internal  characteristics. — The  directions  contain  a  master  table 
and  seven  sample  items,  two  of  which  have  the  correct  answers  marked 
and  five  of  which  the  examinee  is  required  to  answer.  Sample  items  are 
not  scored.  The  test  contains  one  master  table  and  50  recorded  and 
scored  items. 

(2) Administration. — Each examinee receives a special IBM form in
which both directions and items are printed. Three minutes are allowed
to complete the test items. Sample items appear in figure 16.6.


Following  is  part  of  the  directions : 

Column A is a column of numbers in order of size. In column B are numbers to
be  filed  after  the  correct  numbers  in  A.  The  answers  are  given  after  each  number. 
Find  the  correct  answers  by  deciding  after  which  number  in  A  you  will  file  the  num¬ 
ber  you  are  working  on,  then  blacken  the  space  under  the  correct  answers. 

(3)  Scoring. — When  the  test  was  introduced  into  the  classification 
battery,  the  scoring  formula  was  R— 3W.  This  was  changed  to  R  —  2W 
and  finally  to  R  —  W. 

Statistical  results.  (1)  Distribution  statistics. — Typical  examples  of 
distribution  statistics  are  given  in  table  16.16. 
Table 16.16. — Distribution constants for Number Reading, CP604B¹

Group                                  N       M      SD

Unclassified aviation students²       214    14.4    7.4
Bombardiers³                          238    15.7    7.4
Navigators³                           194    18.1    7.4

¹ On these samples scoring formulas could not be determined from available data.
² Tested in March 1942 at Psychological Research Unit No. 1.
³ Tested from Apr. 1 to Aug. 14, 1942 at Psychological Research Unit No. 2.

(2) Reliability coefficient. — By the odd-even method, an estimated
reliability coefficient of 0.93, corrected for length, was obtained using
the scoring formula R - 3W. Since the test is speeded, this coefficient is
spuriously high. This figure is based on a sample of 234 unclassified
aviation students tested in March 1942 at Psychological Research Unit
No. 1.

(3)  Test  validity. — Validation  results  based  on  several  samples  are 
given  in  table  16.17. 

Table 16.17. — Validity data for Number Reading (Filing), CP604B, using the
graduation-elimination criterion¹


Group 

N, 

! 

SD, 

rMs 

394 

0.88 

10.4 

13.6 

7.0 

0.44 

Do*  . 

163 

.71 

19.8 

12.5 

7.9 

.55 

Do*  . 

392 

.77 

9.7 

1  •*  ™ 

13.8 

7.7 

.45 

¹ On these samples scoring formulas could not be determined from available data.

² New students and eliminated pilots, tested from Apr. 1 to Aug. 14, 1942 at
Psychological Research Unit No. 3.

³ New students in classes 42-11 to 42-16 at Hondo Army Air Field tested at
Psychological Research Unit No. 2.

⁴ Eliminated pilots. Same classes and tested in same unit as in footnote 3.

Evaluation. — On relatively small samples, Number Reading proved to
have  high  navigator  validity.  The  test  has  satisfactory  reliability  and  is 
easy  to  administer  and  to  score. 

Number  Size,  CP605B 

This is Part V of the Quantitative Perception tests.

Description. — The  examinee’s  task  in  this  test  is  to  scan  rapidly  rows 
of  numbers  and  to  underline  those  that  fall  within  a  specified  number 
range. 

(1) Internal characteristics. — The test is divided into five sections.
Each section contains 75 numbers ranging from 1 to 99. There are 70
numbers to be underlined, making a total possible score of 70, if all the
correct numbers are underlined and all of the incorrect numbers are not.

(2)  Administration. — Each  examinee  receives  a  special  IBM  form  on 
which  both  directions  and  items  are  printed.  Two  minutes  are  allowed 
to  complete  the  test  items.  Sample  problems  from  the  test  are  shown  in 
figure  16.7. 
FIGURE 16.7

SAMPLE  PROBLEMS  OF  NUMBER  SIZE, 
CP605B 


Following  is  part  of  the  directions: 

In each of the sections the answer spaces under certain numbers are to be
blackened. Follow the directions given at the beginning of each section. Work as rapidly
as  you  can  without  making  mistakes;  an  error  will  deduct  three  points  from  your 
score. 

(3)  Scoring. — The  scoring  formula  used  is  R— 3W. 

Statistical results. (1) Distribution statistics. — Typical examples of
distribution statistics are given in table 16.18.


Table 16.18. — Distribution constants for Number Size, CP605B


Group 

N 

SD 

238 

37.$ 

8.9 

194 

39.8 

7.4 

* Tested from Apr. 1 to Aug. 14, 1942 at Psychological Research Unit No. 1.


(2)  Test  validity. — Validation  results  based  on  several  samples  are 
given  in  table  16.19. 


Table 16.19. — Validity data for Number Size, CP605B, using the
graduation-elimination criterion


Group 

N 

mi 

M. 

M. 

SD, 

Pilots  In  primary 

training1  . 

676 

0.60 

39.0 

■H 

;.i 

0.10 

Navigators'  . 

194 

.88 

40.8 

7.4 

.41 

16) 

.71 

39.0 

8-7 

.16 

392 

.77 

40.3 

■EZl 

.!« 

■  an  uaj'ei  **eg  anu  <iai, 

² New students and eliminated pilots. Tested from Apr. 1 to Aug. 14, 1942 at
Psychological Research Unit No. 3.

³ New students. In classes 42-11 to 42-16 at Hondo Army Air Field. Tested at
Psychological Research Unit No. 2.

⁴ Eliminated pilots. Classes and testing unit same as in footnote 3.


Evaluation. — Number Size showed a moderate validity for navigators
and  a  very  low  validity  for  pilots. 
Dial Reading and Table Reading, CP622A and CP621A *

In  some  phase  of  their  work,  pilots,  bombardiers,  and  navigators  all 
find  it  necessary  to  observe  dials  and  to  consult  printed  tables.  An  activ¬ 
ity  thus  common  to  three  air-crew  specialties  could  hardly  escape  early 
notice  in  the  job  analyses.  Moreover,  dial  reading  and  table  reading  are 
activities  that  lend  themselves  very  readily  to  printed-test  presentation. 
As was true of most early tests whose underlying content had yet to be
revealed by factor analysis, the rationale for the adoption of dial- and
table-reading tests was the common-sense one of merely expecting
validity commensurate with the extent to which air-crew members depend
upon an activity such as that measured by this test.

Description. — As  suggested  by  the  title  and  the  code  numbers,  the  test 
booklet  is  a  composite  of  two  tests  which  once  were  treated  as  separate 
units. 

CP622A, the first section of the booklet, is the Dial Reading test,
occupying the first eight pages. On a typical page a bank of seven dials is
drawn on the upper half and repeated with different needle positions on
the lower half of the page. The dials are labeled: RPM, Airspeed,
Altitude, Voltmeter, Temperature, Fuel-Air Ratio, Amperes. The dials
differ widely in graduation, ranging from 2,500 units portrayed on the
RPM dial down to only five units on the Fuel-Air Ratio dial. Indicator
needles point to given values on the various dials. Below each printing
of the seven dials are six items calling for certain readings. The
examinee's task is first to look at the item to see which dial is to be read;
next, to locate the dial and read it correctly (interpolating if necessary);
then to identify this reading among the five choices offered; and, finally,
to mark this choice on the separate answer sheet. In figure 16.8 are shown
the dials and sample problems.


SAMPLE PROBLEMS

                  A        B        C        D        E
1. R.P.M.         91.      69.5     9.5      92.      105
2. AMPERES        16.      -15.5    14.      -10      -2
3. ALTITUDE       157.5    15.5     155      1.5      152

FIGURE 16.8
SAMPLE PROBLEMS OF DIAL READING,
CP622A

triWtar :  U.  Frank  J.  Dwki  TV  acvowl  tl  >bu  tfU  w *•«»**  fcf  U  Jtk«  W.  lltH,  Jf*

CP621A, the Table Reading test, is divided into two parts. Part I
makes use of a large rectangular table of 35 columns and 35 rows,
presenting respectively first values and second values. The center column
and row are labeled 0. Negative values are listed in the columns to the
left of center and the rows below the center, while positive values run
upwards and to the right of center. Each test item lists two values: one
first value and one second value. The examinee's task is to locate the
correct column and correct row, read the entry at their intersection,
identify it among the five choices listed for the item, and mark the choice
on his answer sheet. In figure 16.9 are shown the table and sample problems.

Part  II  of  CP621A  uses  four  tables  of  fairly  complex  construction, 
containing  five  variables  associated  with  the  flight  of  an  airplane.  The 
five  variables  are  air  speed,  angle  of  wind,  velocity  of  the  wind,  correc¬ 
tion  for  drift  with  the  wind,  and  ground  speed.  The  first  three  are  con¬ 
sidered  the  independent  variables  by  which  the  last  two  are  determined. 
Thus,  in  each  of  the  45  items,  the  air  speed,  the  wind  angle,  and  the  wind 
velocity  are  given,  and  the  examinee  is  required  to  find  either  the  drift 
correction  or  the  ground  speed.  Four  specific  air  speeds  determine  which 
of  the  four  tables  should  be  entered;  wind  velocity  determines  the  col¬ 
umn;  and  wind  angle  determines  the  row.  Figure  16.10  shows  a  sample 
item  and  table. 
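The three-step entry into the Part II tables can be pictured as a nested
lookup. The sketch below is only an illustration: the report does not
reproduce the tables, so every number in `TABLES` is invented, and the names
`TABLES` and `read_table` are hypothetical.

```python
# Assumed layout, following the description: air speed selects the table,
# wind angle selects the row, wind velocity selects the column.
# Each cell holds (drift correction, ground speed). All values are made up.
TABLES = {
    100: {
        30: {10: (3, 92), 20: (6, 84)},
        60: {10: (5, 95), 20: (10, 92)},
    },
    120: {
        30: {10: (2, 112), 20: (5, 104)},
        60: {10: (4, 115), 20: (8, 111)},
    },
}


def read_table(air_speed, wind_angle, wind_velocity, wanted):
    """Return the drift correction or the ground speed for one test item."""
    drift_correction, ground_speed = TABLES[air_speed][wind_angle][wind_velocity]
    if wanted == "drift correction":
        return drift_correction
    if wanted == "ground speed":
        return ground_speed
    raise ValueError("wanted must be 'drift correction' or 'ground speed'")


# One hypothetical item: air speed 100, wind angle 60, wind velocity 20.
print(read_table(100, 60, 20, "ground speed"))
```

The examinee performs the same sequence by eye, then matches the value found
against the five printed choices.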

(1) Internal characteristics. — The directions for Dial Reading contain
three sample items with the correct answers marked and three recorded,
but unscored, practice items. Dial Reading contains 57 recorded and
scored items.

FIGURE 16.10
SAMPLE PROBLEMS OF TABLE READING,
CP621A, PART II

The  directions  for  Table  Reading,  part  I,  contain  three  sample  items 
with  the  correct  answers  marked  and  two  recorded,  but  unscored,  prac¬ 
tice  items.  Table  Reading,  part  I,  contains  43  recorded  and  scored  items. 
The  directions  for  table  reading,  part  II,  contain  two  sample  items  with 
the  correct  answers  illustrated  and  two  recorded,  but  unscored,  sample 
items.  Table  Reading,  part  II,  contains  43  recorded  and  scored  items. 

(2)  Administration. — As  previously  mentioned,  CP622A  and  the  two 
parts  of  CP621A  are  all  printed  in  the  same  booklet.  The  3  units  are 
administered  in  succession,  the  time  limits  being  9,  8,  and  7  minutes  re¬ 
spectively.  Answers  for  all  three  units  are  entered  on  a  single  answer 
sheet.  Sample  problems  and  all  necessary  directions  are  printed  in  the 
test  booklet. 

(3) Scoring. — For the 5 months following their insertion in the
battery, both tests were scored by the formula R - W. During this period,
the two tests were treated as discrete measures, separate scores being
secured for each. Statistical study of these scores revealed, however, that
the two tests are functionally very similar, so that, since that time, they
have been scored as one test, using the formula (R - W)/2.

Statistical  results. — Since  this  test  was  included  in  the  classification 
battery,  it  was  extensively  analyzed. 

(1)  Distribution  statistics. — Typical  examples  of  distribution  statis¬ 
tics  are  given  in  table  16.20. 


Table 16.20. — Distribution statistics for Dial and Table Reading, CP622A and
CP621A, based upon samples of unclassified aviation students

                     Psychological
Form                 research unit No.   Testing dates                N       M      SD

CP622A               1                   October 1942               1,454   22.1    7.5
CP622A               1                   Do                         1,980   21.7    7.4
CP622A               3                   August and September 1942  1,520   24.0    7.4
CP622A               3                   March 1943                   392   25.3    7.2
CP621A               1                   October 1942               1,455   39.4   13.8
CP621A               1                   Do                         1,991   31.?   13.5
CP621A               2                   August and September 1942  1,520   39.7   14.1
CP621A               2                   October 1942               1,374   39.4   14.4
CP621A               3                   March 1943                   392   43.1   12.2
CP621A and CP622A    1                   December 1942              1,094   32.0    9.1
CP621A and CP622A    2                   Do                         1,015   34.0   10.0
CP621A and CP622A    1                   Do                         1,143   33.7    9.0
CP621A and CP622A    1, 2, 3             July 1943                  3,000   33.9    ?
CP621A and CP622A    1, 2, 3             November 1943              1,500   34.5    ?.9
CP621A and CP622A    4-10¹               Do                         1,920   32.4    9.1

¹ Medical and Psychological Examining Units.


(2) Internal consistency. — The degree of homogeneity of the items,
of all three parts combined, is indicated by a mean internal-consistency
phi of 0.20, a standard deviation of the phi distribution of 0.07, and a
range of values from 0.04 to 0.42. These statistics are based upon the
responses of the highest 25 percent and the lowest 25 percent in total
score of a group of 800 unclassified aviation students tested in October
1943 at Psychological Research Unit No. 3.
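An internal-consistency phi of this kind can be sketched as below. The report
does not give its computing formula, so the ordinary fourfold phi coefficient
is assumed; the function name `item_phi` and the small data set are
hypothetical.

```python
from math import sqrt


def item_phi(total_scores, item_correct, fraction=0.25):
    """Phi between passing an item and membership in the high-scoring group.

    Only the top and bottom `fraction` of examinees (by total score) are used.
    `item_correct` holds 1 for a right answer on the item and 0 otherwise.
    """
    n = int(len(total_scores) * fraction)
    if n == 0:
        raise ValueError("sample too small for the requested fraction")
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    low, high = order[:n], order[-n:]
    a = sum(item_correct[i] for i in high)  # high group, item right
    b = n - a                               # high group, item wrong
    c = sum(item_correct[i] for i in low)   # low group, item right
    d = n - c                               # low group, item wrong
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else 0.0


# Hypothetical data: eight examinees, so each extreme group holds two.
totals = [1, 2, 3, 4, 5, 6, 7, 8]
item = [0, 0, 1, 0, 1, 1, 1, 1]
print(item_phi(totals, item))
```

With 800 examinees and a fraction of 0.25, each extreme group would hold 200
cases, as in the sample described above.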

(3) Reliability coefficient. — Five samples yielded the estimates of
reliability given in table 16.21.


Table 16.21. — Estimated reliability coefficients of Dial Reading Test, CP622A,
and Table Reading Test, CP621A


Form 

Croup 

Typ. 

N 

r« 

CP622A  . 

Unclassified  aviation  student.'* 
. Do*  . 

1.167 

1.167 
1,167 

712 

t.OOO 

0.62 

.71 

.77 

"ii 

0.76 

.04 

.87 

.82 

.90 

CPo.'lA  . 

Cl*62l  A  and  622A 
CP621A  and  622A 
CP62IA  and  622A 

. Do*  . 

. Do  . 

. Do*  . 

1  Texted  at  Medical  and  Pathological  Examining  Unit  No.  *.  Toting  dato  not  reported;  dot* 
reported  May  194$. 

² Administered for experimental purposes in two separately timed halves.

³ 28-day interval between testing. Tested at Medical and Psychological
Examining Unit No. 6 from 11 to 15 April 1945.

⁴ Tested at Medical and Psychological Examining Unit No. 7, April 1944.

(4) Difficulty. — Based upon the responses of the above-mentioned
sample  of  800  unclassified  aviation  students,  the  three  parts  of  the  test 
yielded  a  mean  proportion  of  correct  responses  of  0.85,  corrected  for 
chance,  with  a  range  from  0.25  to  0.99  and  a  standard  deviation  of  0.09. 
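The report does not say which chance correction was applied. One common
adjustment for items with k choices, assuming every examinee attempts the
item and that wrong answers reflect blind guessing, rescales the raw
proportion so that pure guessing yields zero. The function name below is
hypothetical, and the raw proportion of 0.88 is an invented input chosen only
to show the arithmetic.

```python
def chance_corrected_proportion(p_correct: float, n_choices: int) -> float:
    """Rescale a raw proportion correct so that blind guessing gives 0."""
    if n_choices < 2:
        raise ValueError("an item needs at least two choices")
    chance = 1.0 / n_choices
    return (p_correct - chance) / (1.0 - chance)


# With five choices, a raw proportion of 0.88 corresponds to about 0.85.
print(round(chance_corrected_proportion(0.88, 5), 2))
```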

(5) Factorial composition. — For all parts combined the most
significant loadings are in the numerical (0.53), space I (0.42), and
perceptual-speed (0.31) factors. The communality is 0.65. For a full picture
of the factorial composition of this test, see appendix B.

(6) Test validity. — Validation results are given in tables 16.22 to
16.25 inclusive.

(7) Item validity. — Validation of items of this test disclosed the
results recorded in table 16.26.

Evaluation. — As indicated by the code number, the test was primarily
thought to be a test of perception. On a priori grounds this assumption
was entirely reasonable. One would certainly expect that rapidly looking
through tables and inspecting dials would draw markedly upon perceptual
ability. In part, this did prove to be the case, for 11 percent of the total
variance of the test is accounted for by the perceptual-speed factor. In
addition, however, two other factors show even greater saturation in
the test. Twenty-eight percent of the total variance is numerical and 19
percent is in spatial relations. The communality represents 65 percent
of the total variance. Thus, reading a set of dials and tables involves
more than might be expected at first glance. Apparently, numerical
ability is required in the implicit additions and subtractions while
interpolating dial readings. In table reading, the use of positive and
negative numbers and of quadrants of the table is no doubt a contributing
influence to the numerical loading.
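The percentages in this paragraph can be related to the loadings reported
under factorial composition. Assuming uncorrelated factors (an assumption
made here for illustration, not stated in this section), the share of total
variance attributable to a factor is the square of its loading. The variable
names are hypothetical; the loadings and communality are those quoted above.

```python
# Loadings and communality as reported for all parts combined.
loadings = {"numerical": 0.53, "space I": 0.42, "perceptual speed": 0.31}
communality = 0.65

# Under orthogonal factors, variance share = loading squared.
variance_share = {factor: value ** 2 for factor, value in loadings.items()}
explained_by_three = sum(variance_share.values())
remaining_common = communality - explained_by_three

for factor, share in variance_share.items():
    print(f"{factor}: {share:.1%} of total variance")
print(f"other common factors: {remaining_common:.1%}")
```

The squared two-decimal loadings come to roughly 28, 18, and 10 percent,
against the 28, 19, and 11 percent quoted in the text; the small differences
are presumably due to rounding of the published loadings.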
¹ Product-moment correlation.

*  Including  Medical  and  Psychological  Examining  Units. 


Table 16.26. — Validity of items of Dial Reading, CP622A, and Table Reading,
CP621A, using the graduation-elimination criterion


Croup 

Range  ef  • 

N 

M« 

SD0 

Low 

High 

Pilots  in  primary  training1 

1,400 

745 

\(a.A _ A 

0 .57 

0.02 

0  0J 

-0.07 

0.14 

Navigator**  . 

1  Tm  411  To.t.4 

.81 

.04 

•OS 

-.07 

.U 

1  *n  c-*55  Tested  in  March  and  April  1943  at  Psychological  Research  Unit  No.  jf. 

*  Class  not  identified.  Tested  in  April  and  May  1944  at  Psychological  Research  Unit  Ms.  ), 


The  Dial  and  Table  Reading  test  attained  its  highest  validity  in  predict¬ 
ing  navigator  success.  This  it  did  unusually  well,  the  corrected  coefficient 
for  the  sample  of  nearly  2,000  cases  being  0.54.  For  success  in  primary 
pilot  training,  the  test's  predictive  power  was  much  more  moderate.  The 
coefficients  range  mostly  from  0.20  to  0.28. 

Of all the tests in the classification battery, Dial and Table Reading has
the distinction of being the best single predictor of success for any
air-crew specialty. It predicts navigator success with a validity of
approximately 0.54. The explanation, no doubt, lies in the fact that two
important fundamental navigator abilities are measured by the test to a
substantial degree. These are the numerical and the spatial-relations
factors. Speed of perception is also sampled, and it contributes a small but
appreciable amount to the navigator validity. These three factors account
for about 80 percent of the obtained validity, and small loadings on
reasoning I, mathematical background, verbal, and other factors very nearly
account for the remaining 20 percent. The pilot validity is likewise fully
accounted for by familiar factors.

It should be pointed out, however, that the three leading factors
contained in Dial Reading and Table Reading are better measured by other
battery tests. In spite of the high navigator validity, therefore, it cannot
be considered that the prediction of navigators is the greatest value of
the test, for that can be done on the basis of other tests. Dial and Table
Reading did aid in the discovery and definition of these factors, however,
and served to give secondary and supplementary coverage of them in the
classification battery.

Paratroop Dropping Test, CI209A ⁷

This  test  was  designed  to  measure  table  reading  and  computational 
ability  in  a  meaningful  situation. 

Description. — Each item of the test requires the examinee to select
from five diagrammatic sketches the one that portrays a situation in
which a paratrooper would land directly on his objective.

Each of the diagrammatic sketches shows an airplane in flight, having
passed over and beyond its objective, an enemy city. The distance
beyond the objective is given. The task of the examinee is to determine
whether or not the given distance is equal to the distance that the
parachutist would land (considering plane height, air speed, wind speed,
and diameter of the parachute) behind the airplane. If the two distances
are equal, the examinee concludes that the paratrooper would land
directly upon his objective.

⁷ Developed at Medical and Psychological Examining Unit No. 10. Chief
contributor: Capt. Joseph E. King.

(1)  Internal  characteristics. — The  directions  contain  three  recorded, 
but  unscored,  sample  items.  The  test  contains  22  scored  items. 

(2) Administration. — Ten minutes are allowed for the directions and
30 minutes for the test problems.

Following are parts of the directions:

A pilot is assigned the task of dropping paratroops upon an enemy town. In
dropping the chutists he must consider four factors: (1) The height of the airplane
from the ground, (2) the diameter of the parachute, (3) the speed of the wind
against which he is flying, and (4) the speed of the airplane.

The effect of each factor is stated in terms of the feet behind the airplane that
the chutist will land. Two or more factors may operate at the same time, as, for
example, speed of the wind and diameter of the parachute. When this happens, the
chutist lands a distance behind the plane which is the sum of the two or more
distances.

Distances behind the airplane that the chutist will land, under the
various conditions imposed, are given in tables accompanying the problems.

(3) Scoring. — The scoring formula used is R - W/4.
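The divisor 4 matches the usual correction for guessing on five-choice items,
R - W/(k - 1) with k = 5; the report itself does not state this rationale, so
the connection is offered here only as a plausible reading. A short sketch,
with a hypothetical function name and invented counts:

```python
def guessing_corrected_score(rights: float, wrongs: float, n_choices: int = 5) -> float:
    """Rights minus wrongs divided by (choices - 1); omissions count nothing."""
    if n_choices < 2:
        raise ValueError("an item needs at least two choices")
    return rights - wrongs / (n_choices - 1)


# Hypothetical examinee on the 22 scored items: 16 right, 4 wrong, 2 omitted.
print(guessing_corrected_score(16, 4))

# Blind guessing on all 22 items gives, on average, 4.4 right and 17.6 wrong,
# so the expected corrected score is zero.
print(round(guessing_corrected_score(22 / 5, 22 * 4 / 5), 6))
```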

Statistical results. — The limited data available are for examinees
tested at Medical and Psychological Examining Unit No. 10.

(1) Distribution statistics. — A sample of 400 unclassified aviation
students yielded a mean score of 14.9 and a standard deviation of 5.1.

(2) Internal consistency. — The degree of homogeneity of the items is
indicated by a median internal-consistency phi of 0.60.

(3) Reliability coefficient. — By the odd-even method, an estimated
reliability coefficient of 0.73, uncorrected, was obtained. This figure is
based on a sample of 400 unclassified aviation students.

Evaluation. — Correlations with battery tests indicated that the
Paratroop Dropping test was not measuring any factors not already covered
by other tests, so validation was not recommended.

Marking  Accuracy  (no  code  number)* 

This test was constructed as part of an experiment designed to
determine whether test validities and other correlations were affected by
speed in marking answer sheets. All printed tests are machine scored,
answers being recorded on a separate answer sheet. This operation is a
clerical function which, in speed tests, may influence the score to a
practical extent. If the influence of any irrelevant function proves to be
real, it can be subtracted, permitting the intended nature of each test to
be more prominent.

* Developed at Psychological Research Unit No. 3. Chief contributor: Lt. Frank Dudek.

Description. — The  test  material  consists  of  IBM  answer  sheets  on 
which  the  letters  designating  answer  positions  are  circled  by  overprint¬ 
ing.  The  examinee’s  task  is  merely  to  blacken  the  space  under  the  circled 
letters. 

(1)  Internal  characteristics. — The  test  is  divided  into  2  parts,  each 
containing  75  items. 

(2) Administration. — The directions are delivered orally. Forty
seconds are allowed to mark the items in each part. The total testing time
is approximately three minutes. Following is part of the
orally-administered directions:

We are interested in determining how much of a dexterity factor there is in a
paper-and-pencil test. For this reason, we are asking you merely to fill out an answer
sheet as rapidly as you can.

You  have  an  answer  sheet  on  which  small  circles  indicate  the  spaces  to  be  black¬ 
ened.  It  is  important  that  the  correct  spaces  be  blackened  and  that  they  be  blackened 
adequately  enough  to  be  scored  on  an  electric  scoring  machine. 

(3)  Scoring. — The  score  is  simply  the  number  of  the  last  marked  item. 

Statistical  results. — The  available  data  are  based  upon  examinees 
tested  at  Psychological  Research  Unit  No.  3. 

(1)  Distribution  statistics. — A  sample  of  284  classified  pilots  tested 
in  March  and  April  1942  and  August  and  September  1943  yielded  a 
mean  score  of  83.6  and  a  standard  deviation  of  8.0.  The  distribution 
curve  is  slightly  positively  skewed  and  somewhat  flatter  than  normal. 

(2) Reliability coefficient. — By the alternate-forms (part I-part II)
method, an estimated reliability coefficient of 0.86, corrected for length,
was obtained. This figure is based on the above-mentioned sample of
284 classified pilots.

(3) Factorial composition. — The only substantial loadings are in the
psychomotor III (0.50) and perceptual-speed (0.35) factors. The
communality is 0.41. For a fuller picture of the factorial composition of this
test, see appendix B.

Evaluation. — Marking Accuracy, like Log Book Accuracy (described
below), was factor-analyzed in the Integration Battery (see ch. 10).
Marking Accuracy differs from Log Book Accuracy only in its secondary
loadings. Both tests are highly saturated with the psychomotor-speed
factor. The Marking Accuracy test shows some perceptual-speed
variance (loading of 0.35), in contrast to an equivalent loading on the
numerical factor for Log Book Accuracy. Apparently the task of locating the
reference circles on the answer sheet demands a substantial amount of
perceptual-speed ability.
Log Book Accuracy (no code number)*

This  test  was  also  designed  for  the  purpose  of  determining  the  part 
that  speed  of  marking  an  answer  sheet  might  play  in  many  printed  tests. 

Description. — In the test booklet are listed item numbers followed by
the  answers  A,  B,  C,  D,  or  E  to  be  marked.  The  item  numbers  are  in 
random  order  rather  than  in  sequence.  The  examinee’s  task  is  to  record 
the  answers  quickly  and  correctly  on  a  separate  answer  sheet. 

(1)  Internal  characteristics. — The  directions  contain  an  illustration  of 
five  correctly  marked  items.  Parts  I  and  II  each  contain  75  recorded  and 
scored  items. 

(2)  Administration. — Four  minutes  are  allowed  for  each  part,  while 
directions  consume  about  3  minutes,  making  a  total  testing  time  of  ap¬ 
proximately  11  minutes. 

Following are parts of the directions and sample items:

Opposite each of the item numbers in the test booklet is a letter. Your only task is
to blacken the space on your answer sheet which corresponds to the item number and
letter in the booklet.

Look  at  the  five  sample  items: 

1.  C 

4.  A. 

JL  D. 

2.  E. 

5.  a 

If you were marking your answer sheet, you would blacken space C opposite item
1; space A opposite item 4; and so on.

(3)  Scoring. — The  scoring  formula  is  R— W. 

Statistical results. (1) Distribution statistics. — A sample of 278
classified pilots tested at Psychological Research Unit No. 3 in August and
September 1943 yielded a mean score of 66.2 and a standard deviation
of 11.0. The distribution curve is approximately symmetrical and
considerably flatter than normal.

(2) Reliability coefficient. — By the alternate-forms (part I-part II)
method, an estimated reliability coefficient of 0.75, corrected for length,
was obtained. This figure is based on the above-mentioned sample of
278 classified pilots.

(3) Factorial composition. — The only substantial loadings are in the
psychomotor III (0.60) and numerical (0.32) factors. The communality
is 0.56. For a fuller picture of the factorial composition of this test, see
appendix B.

Evaluation. — The only available data on Log Book Accuracy are in
connection with the factor analysis of integration tests (see ch. 10). The
two marking tests, Log Book Accuracy with a loading of 0.60 and Marking
Accuracy with a loading of 0.50, defined a new factor hitherto unknown.
Whether the factor is broader than speed of marking cannot be
answered from the available data. A loading of 0.32 for Log Book Accuracy
on the numerical factor indicates that the task of locating answer-sheet
numbers quickly when they are out of sequence involves numerical
facility. Some spatial ability is possibly involved here also, but the loading
in the spatial-relations factor (0.19) is hardly significant enough to
support adequately this assumption.

* Developed at Psychological Research Unit No. 3. Chief contributor: Capt. Stuart W. Cook.

The principal value of Log Book Accuracy and Marking Accuracy was
to show that if marking speed affects the scores on the majority of
machine-scored tests, it does so only slightly. In no test included in the
integration analysis, many of which have complicated marking directions,
did marking speed constitute a serious source of extraneous variance,
for in none is there an appreciable loading with this factor.

Evaluation of Clerical Speed Tests

This group of tests was shown to measure abilities that are important
to air-crew members, especially the navigator, in achieving success in
training. The best representative measure of the group, Dial and Table
Reading, has occupied a permanent place in the classification battery since
the early months of testing.

The principal features of these tests are their numerical and
perceptual-speed content. In some of these tests spatial-relations factorial
content is also present. The numerical factor is the most valid factor for
navigators yet measured, while perceptual speed and spatial relations are
valid for all three air-crew specialties.

Although there was no special intent by the AAF to analyze the nature
of clerical tasks as such, the similarities of the problems presented by
this group of tests offer an opportunity to speculate regarding clerical
activities in general. Since they seem to be primarily perceptual and
numerical, a test battery composed of pure measures of these two factors
weighted properly might be adequate for measuring aptitude for many
types of clerical work. For both factors, relatively pure measures are
available. There is some reason to believe that two other factors may
also enter into aptitude for certain types of clerical work. These are
psychomotor speed and spatial relations. In Log Book Accuracy we have
a fair measure of psychomotor speed, but we need more data upon which
to base an interpretation of the factor before we can safely prescribe its
use. For spatial relations, no independent measures have yet been
discovered, although there are several tests with significant saturations in
it. Further discussions of this trait will be found in later chapters.

EVALUATION OF PERCEPTUAL SPEED TESTS

Factor analysis of experimental and classification tests developed for
use in the selection of air-crew members has, so far, revealed only one
factor to which the name perceptual has been applied. This factor has
been described as the ability to discriminate rapidly visual differences in
form and detail. The factor is best defined by Speed of Identification, a
test similar to Thurstone's Identical Forms test which, in his original
analysis of 56 variables, best defined the factor P.

Tests  that  reveal  relatively  pure  or  independent  factors  were  of  par¬ 
ticular  value  to  aviation  psychologists.  They  were  very  useful,  for  ex¬ 
ample,  in  factor-analysis  studies.  In  setting  up  a  correlational  matrix  to 
be  analyzed,  factor  analysts  recognize  that  the  inclusion  of  such  tests 
helps  to  simplify  the  rotational  procedure,  particularly  in  determining 
the initial directions of rotation. Lesser known factors can then be
revealed more quickly and clearly. The greatest value of pure tests is their
adaptability  in  selection  testing.  If  a  job  is  evaluated  in  terms  of  factors, 
it  is  necessary  only  to  procure  an  independent  measure  of  each  factor 
involved  and  to  weight  each  measure  properly  in  the  final  evaluation. 
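
The weighting step described in this paragraph can be sketched as follows. This is only an illustration of the idea: the factor names are taken from the text, but the function name, the weights, and the standard scores are hypothetical values chosen for the example, not figures from the report.

```python
# Hypothetical sketch: combine independent ("pure") factor measures into
# one aptitude composite. All weights and scores below are made up.

def composite_score(scores, weights):
    """Return the weighted sum of factor scores for one examinee."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"no score for factors: {sorted(missing)}")
    return sum(weights[factor] * scores[factor] for factor in weights)

# Assumed description of a job in terms of factors (illustrative weights).
navigator_weights = {
    "numerical": 0.5,
    "perceptual_speed": 0.3,
    "spatial_relations": 0.2,
}

# Assumed standard scores of one examinee on a pure test of each factor.
examinee = {
    "numerical": 1.0,
    "perceptual_speed": 0.0,
    "spatial_relations": -1.0,
}

navigator_composite = composite_score(examinee, navigator_weights)
```

In practice the weights would have to come from validation data for the job in question; nothing in this sketch says how they are obtained.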

The fact that perceptual speed is involved to a greater or lesser
extent in such a variety of jobs and occupations insures a permanent use
for a pure measure of the factor. Tests such as Speed of Identification
and Identical Forms are thus of considerable value because they are among
the purest measures of single known factors.


FACTOR  ANALYSES  OF  PERCEPTUAL  TESTS 

The  Data 


Two factor analyses were made of perceptual tests. A battery of 31
tests was administered to 392 unclassified aviation students. The battery
consisted of a number of tests designed to measure different aspects of
perception plus selected classification-battery tests of known factorial
content. For the purposes of analysis, the battery was divided into two
smaller groups, 18 tests in the first matrix and 22 in the second, with
12 tests common to both. While there are disadvantages in making two
smaller analyses in place of a single large one, it was believed that the
greater ease of computation and simplicity of rotation would make the
former method more profitable. The correlation matrices are given in
tables 16.27 and 16.28.

All the tests included in these analyses are described elsewhere in this
volume, with the exception of the four that are briefly described in the
following paragraphs. Full descriptions of these tests can be found in
reports 6 and 7 of the AAF psychological series.

Speed Estimation II (Identification of Velocities), CP205B-II. — This
is a motion-picture test presenting a model plane against a moving
background (clouds). The airplane shown in the middle of the screen thus
appears to be moving. The examinee is taught to recognize five different
velocities before the testing period begins. The five velocities are then
presented in random order for identification.

Plane Formation, CP805B. — This is a motion-picture test using the
tachistoscopic  method  of  presenting  material  to  be  remembered.  Each 
presentation  shows  a  grid  of  25  squares  upon  which  appear  5  plane  sil¬ 
houettes.  The  task  of  the  examinee  is  to  observe  the  screen  and,  imme¬ 
diately  after  the  exposure,  to  fill  in  spaces  on  the  answer  sheet  corre¬ 
sponding  to  the  sections  of  the  grid  that  included  the  planes. 

Gottschaldt Figures, QP901A, Part III. — In this test the examinee’s
task  is  to  determine  which  one  of  five  simple  geometric  figures  is  con¬ 
tained  in  different  complex  geometrical  figures. 

Aerial Photographs, QP901A, Part IV. — This test presents a series
of  oblique  aerial  photographs  upon  which  a  number  of  alphabetically 
lettered  points  are  placed.  The  examinee’s  task  is  to  answer  questions 
concerning  various  distance  relationships  between  the  points. 

The  Factors 

Six factors were obtained in the first analysis. The same six factors
plus two others and one residual factor were obtained in the second. In
the following text, factor loadings in the two analyses are treated
together. Loadings are reported if a test appears with a saturation of 0.25
or greater in either study. The rotated factors are numbered to
correspond in the two analyses. The centroid loadings are shown in tables
16.29 and 16.30, and the rotated factor loadings in tables 16.31 and
16.32.


Table 16.29. — Centroid factor loadings of Perceptual Battery I

Rotated factor I is defined by the following data:


Test  No. 

Tot  name 

Loading* 

_  I  ! 

II 

17 

1 

3 

,  ,  ,  .  O 

oli 

23 

2 

** 

22 

.43 

.20 

.26 

.21 

.19 

.10 

26 

24 

GottschaMt  Figures,  QP90IA,  Pari  ill  . 

.41 

•JO 

.1* 

3 

27 

5 

10 

'.56 

.25 

21 

.11 

.11 

18 

.27 

.21 

8 

4 

.2? 

This  factor  is  clearly  the  perceptual-speed  factor,  usually  defined  by 
the  Speed  of  Identification  test.  In  the  first  analysis,  probably  because  of 
the  presence  of  two  forms  of  the  same  test,  the  loading  for  Speed  of 
Identification  is  higher  than  usual.  In  the  second  analysis,  the  loading  is 
somewhat  below  the  mean.  It  is  possible  that  in  the  first  analysis  there 
is  specific  nonerror  variance  of  the  test  included  in  the  factor  loading. 
One  noteworthy  discrepancy  between  the  loadings  in  these  two  analyses 
is  for  Plane  Formation.  In  the  second  analysis  the  rotations  reduced  the 
loading  on  this  factor  and  increased  the  loading  on  the  visual-memory 
factor  in  which  it  is  more  heavily  saturated. 

Rotated  factor  II  is  defined  by  the  following  data : 


Test  No. 

Loadings 

I 

II 

7 

0.S0 

.47 

0.S4 

U 

.42 

26 

.12 

.10 

12 

.12 

10 

18 

.26 

.26 

8 

.26 

.19 

IS 

.12 

.26 

Because this factor is best defined by tests that require the
manipulation of visual images, it has been termed visualization. Other tests
that measure the factor are Pattern Comprehension and Spatial Visualization
I and II.

Rotated  factor  III  is  defined  by  the  following  data: 


Test  No. 

Test  nsme 

Loadings 

I 

II 

31 

Plane  Formation  . . . . 

0.44 

mm 

IS 

Directional  Orientation  . 

.12 

22 

Block Counting .

•  to 

K] 

24 

Pattern  Analysis  . . . 

.11 

.IS 

18 

Picture  Integration . 

•  •  • 

.28

This factor is not well known. It has been suggested that pattern
perception might be the fundamental element involved. It is possible,
however, that a memorial component is the single feature that these tests
have in common. Similarities to new factors obtained in other analyses
give some support for this conclusion. A clear-cut visual-memory factor
was isolated in the analyses of the Memory Batteries (see ch. 11). The
Map Memory tests have substantial loadings on the factor. Although
further work will be necessary before these factors can be identified
with each other with complete assurance, the factor will tentatively be
labeled “visual memory.”


Rotated  factor  IV  is  described  by  the  following  data : 


Test  No. 

Test  name 

Loadings 

I 

II 

6 

0  60 

0.45 

s 

.54 

.50 

4 

.50 

26 

.20 

.29 

.28 

27 

.27 

2S 

26 

24 

.14 

.26 

This is the familiar numerical factor, best defined by the Numerical
Operations test. The absence of other number tests from the correlational
matrix made the differentiation between number and reasoning difficult.
The loadings of the two classification tests (Mathematics B, and Dial
and Table Reading) are congruent with previous results, but it is
difficult to rationalize the correlations of the three experimental tests with
this factor in the second analysis. In the first study the experimental
tests are less numerical but have higher reasoning loadings.


Rotated  factor  V  is  defined  by  the  following  data : 


Test  No. 

Test  name 

Loadings 

I 

II 

6 

Mathematics B .

0.49 

0.55 

•  8 

Picture  Integration  .  . . 

1 

.42 

20

Cubes .

.41 

15 

Directional  Orientation  . 

.31 

25 

Gottschaldt Figures .

.21 

.27 

21 

Flags, Figures, Cards .

.26 

.16 

7 

Mechanical Principles .

.25 

.20 

4 

Dial  Reading  . . .  . . 

.24 

•  •  • 

This is the general-reasoning (reasoning I) factor. It was not clear
at the time the factor was first isolated whether or not the modifying
term “inductive” should be used in connection with its description.
Many tests, clearly inductive in character, do have high loadings on the
factor. It is also probable that at least two other reasoning factors have
been isolated (see ch. 7).


Rotated  factor  VI  is  defined  by  the  following  data: 


Test  No. 

Test  name 

Loadings

1 

II 

8 

Complex  Coordination  . 

0.52 

0.50 

21 

Flags,  Figures,  Cards . 

.49 

.4J 

20 

Cubes .

.42 

.47 

15 

Directional  Orientation  . 

.47 

.41 

5 

Table  Reading  . . 

.42 

.35 

4 

Dial  Reading  . . . 

.41 

»  •« 

18 

Picture  Integration  . 

•  •  • 

.40 

2 

Spatial  Orientation  II  . 

.40 

•  •  • 

10 

Path  Distance . . 

.34 

29 

Speed Estimation II .

.33 

7 

Mechanical  Principles . 

.31 

.25 

22 

Block  Counting  . . . . 

.28 

23 

Path  Tracing . 

.26 

24 

Pattern  Analysis  . . . . 

.26 

.IS 

This factor has been termed spatial relations. It was surprising to
find, in the second analysis, that Thurstone's Hands test appeared
projected on a different factor (factor VII). Apparently two spatial
factors are involved; one is best defined by Complex Coordination, the
other by the Hands test. Flags and Cubes appear on both. Other tests
known to be loaded with the original spatial-relations factor are
Instrument Comprehension II, Two-Hand Coordination, and Discrimination
Reaction Time.


Rotated  factor  VII  is  defined  by  the  following  data  from  the  second 
analysis  only: 


Test No.   Test name      Loading II

19         Hands          0.48
21         Flags, etc.     .42
20         Cubes           .25

This is the factor just discussed. Since a rotation or a positional
change seems to be involved, the factor has been tentatively described as
“rotational” or “positional space.” Another hypothesis is that the common
element enabling the subjects to solve problems readily is the ability to
enter the self into the action; that is, by empathetic involvement, which
would call for the name “spatial empathy.” In order not to prejudge
the factor, however, it is designated as space II.

Rotated  factor  VIII  is  defined  by  the  following  tests  from  the  second 
analysis  only: 


Test  No. 

Test name

Loading II

11 

Line  Length  . . . . . . . . 

a  34 

9 

Point  Distance . . . 

.39 

13 

Map  Distance . . . 

Jt 

29 

Speed  Estimation  II  . . . . 

.29 

14 

Path  Length  . . . . . . . 

M 

This factor has a precedent in the one called length estimation found
in the analysis of the Mechanical Battery (see ch. 13). The factor is
common to the Path Distance test and the Pattern Assembly test, as well as
those appearing here.

Rotated  factor  IX  is  a  residual  factor. 

Conclusions 

The most interesting areas for further test construction are the ones
defined by rotated factor III (visual memory) and by rotated factor
VII (Space II). Discussion of the visual-memory factor may be found
in chapter 11 in the report of the factor analysis of memory tests and
in chapter 12 in the evaluation of visualization tests. Space II is
mentioned in chapter 19 in the discussion of evaluation of space tests.

It is of general interest to find that out of all the variety of perceptual
tests analyzed, only two clearly perceptual factors, perceptual speed and
length estimation, emerged. A possible third is a pattern-perception
factor, but that hypothesis was rejected in favor of calling it visual
memory. None of the new factors reported by Thurstone (3) was
brought out. In a number of the tests, the nonerror variances are not
fully accounted for, however, so that in more favorable batteries some
of Thurstone's new factors may yet come to light in these tests.

CHAPTER SEVENTEEN


Form Perception Tests 1


INTRODUCTION 

In this chapter are discussed tests of form perception under the
following a priori rubrics: Pattern-formation tests, pattern-completion tests,
pattern-analysis tests, and illusions tests.

In  the  pattern  formation  tests,  the  examinee  is  required  to  reorganize 
disordered  segments  into  a  coherent  whole.  The  pattern-completion  tests 
present  mutilated  patterns  which  the  examinee  must  recognize.  In  the 
pattern-analysis  tests,  faces,  letters,  nonsense  symbols,  and  words  are 
concealed  in  a  complex,  camouflaging  context.  The  examinee  must  form 
correct  figure-ground  relationships.  The  illusions  tests  present  examinees 
with  familiar  geometric  and  Gestalt  illusions  and  attempt  to  measure 
susceptibility  to  these  illusions. 

With some exceptions, these tests were not constructed in the light of
specific job-analysis information, but rather because of systematic
interests and psychologically guided speculation.

PATTERN  FORMATION  TESTS 
Picture Integration, CP104A *

This  test  was  designed  as  a  measure  of  the  ability  to  visualize  objects 
in  space  and  as  a  test  of  perceptual  integration,  on  the  assumptions  that 
visualization  of  spatial  relationships  and  the  ability  to  perceive  a  scene 
adequately from fragmentary or distorted cues are important for pilots.

Description. — Each test item consists of a photograph which has been
cut into four quarters and rearranged in a scrambled order. One of the
sample  items  used  in  the  directions  is  shown  in  figure  17.1.  The  task  of 
the  examinee  is  to  visualize  the  correct  order  of  the  disarranged  seg¬ 
ments  and  to  indicate  the  correct  arrangement  on  a  work  sheet.  In  the 
bottom  panel  of  figure  17.1  is  shown  an  answer  box  correctly  completed 
for  the  illustrated  item. 

(1)  Internal  characteristics. — The  directions  include  four  sample 
problems.  There  are  30  test  items,  arranged  6  to  a  page.  The  total  num¬ 
ber  of  scored  responses,  then,  is  120. 

(2) Administration. — The directions to the test require approximately
5 minutes. The test items originally required another 20 minutes, but it
was found that 13 minutes suffice. Upon completion of the test, the
examinees transcribe their answers on the work sheet to a specially
prepared IBM answer sheet.

•  Wrill«#  Sr  Cift.  John  I  [jttf  Sft.  Sunltj  W.  XirlMui  iitii  T«k/S|t.  G*f*W  It  SSlrWf
tiling  in  («llat>nt  Ebr  iniKruli  f«r  jfcit

1  Wpnd  it  Elk*  Prrcrptvil  Unit.  Hr jdquartrra,  AAf  TrainMH  CtMnunt  flitl

conlf ibutcri:  Ctpt.  Rickard  li.  IlcutmU  and  ital.

(3) Scoring. — The scoring formula is (3R - W)/4.
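
One way to read this formula, offered as an interpretation and not as the test constructors' stated rationale: each scored response has four possible positions, so the conventional chance correction would be R - W/3, and (3R - W)/4 is that quantity multiplied by 3/4. The sketch below checks the identity for a hypothetical examinee; the function names and the counts are assumptions made for illustration.

```python
from fractions import Fraction

def picture_integration_score(rights, wrongs):
    """Score as given in the text: (3R - W) / 4."""
    return Fraction(3 * rights - wrongs, 4)

def chance_corrected(rights, wrongs, choices=4):
    """Conventional correction for guessing: R - W / (choices - 1)."""
    return rights - Fraction(wrongs, choices - 1)

# Hypothetical examinee: 90 right and 30 wrong out of 120 scored responses.
score = picture_integration_score(90, 30)
rescaled = Fraction(3, 4) * chance_corrected(90, 30)
```

Exact fractions are used so that the two expressions can be compared without rounding error.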

Statistical results. — The data given below are mainly for examinees
tested  in  April  1042  at  Psychological  Research  Unit  No.  3;  those  who 
went  to  pilot  training  were  in  class  43K.  Exceptions  are  noted  in  the 
appropriate  places. 

(1) Distribution statistics. — A sample of 392 unclassified aviation
students  yielded  a  mean  score  of  53.5  and  a  standard  deviation  of  17.3. 

(2) Internal consistency. — The degree of homogeneity of the
responses (there are four responses to an item) is indicated by a mean
phi  coefficient  of  0.32,  with  a  range  from  0.07  to  0.56,  and  a  standard 
deviation  of  0.08.  These  statistics  are  based  upon  the  responses  of  the 
highest  27  percent  and  the  lowest  27  percent  of  a  group  of  350  unclassi¬ 
fied  aviation  students. 
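
For readers unfamiliar with the statistic, a phi coefficient for a single response can be computed from a fourfold table of criterion group (highest versus lowest scorers) by response (right versus wrong). The report does not give the computing procedure actually used, so the following is a minimal sketch under that assumption; the function name and the counts are hypothetical.

```python
import math

def phi_coefficient(upper_right, upper_wrong, lower_right, lower_wrong):
    """Phi for a 2x2 table: criterion group (upper/lower) by response."""
    a, b, c, d = upper_right, upper_wrong, lower_right, lower_wrong
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("a row or column of the table is empty")
    return (a * d - b * c) / denominator

# Hypothetical response. Twenty-seven percent of 350 examinees is about
# 94 per group, so each group's counts are made to sum to 94.
phi = phi_coefficient(upper_right=80, upper_wrong=14,
                      lower_right=50, lower_wrong=44)
```

A mean phi for the test would then be the average of such values over all scored responses.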

(3) Reliability coefficient. — The reliability was estimated by
correlating the score on the first 2 pages (first 48 responses) with the score on
the  last  3  pages  (last  72  responses).  For  experimental  purposes,  these 
artificial  parts  were  timed  separately,  with  an  allowance  of  4%  and  8 
minutes,  respectively.  The  correlation  was  0.63  for  422  pilots.  A  very 
rough  estimate  of  the  reliability  of  the  total  test,  then,  is  0.77. 
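
The step from 0.63 to 0.77 is not spelled out in the text. It is consistent with the Spearman-Brown formula for a test of doubled length, which is assumed in the sketch below; because the two parts are unequal (48 and 72 responses), treating them as halves is itself an approximation, in keeping with the text's "very rough estimate." The function name is illustrative.

```python
def spearman_brown(r_parts, length_factor=2.0):
    """Predicted reliability when a test is lengthened by length_factor."""
    return length_factor * r_parts / (1.0 + (length_factor - 1.0) * r_parts)

# Correlation between the two separately timed parts, from the text.
full_test_reliability = spearman_brown(0.63)
```

Rounded to two places, the result agrees with the estimate printed above.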

(4)  Difficulty. — Based  upon  the  responses  of  350  unclassified  avia¬ 
tion  students,  the  test  yielded  a  mean  proportion  of  correct  responses 
of  0.76,  corrected  for  chance  success,  with  a  range  from  0.47  to  0.92 
and  a  standard  deviation  of  0.12. 

(5) Factorial composition. — The most prominent loadings were found
in the general-reasoning (0.42), spatial-relations (0.40), visual-memory
(0.28), perceptual (0.27), and visualization (0.26) factors. The
communality was 0.61. For a fuller picture of the factorial composition of
the test, see Appendix B.

(6) Test validity. — Validation results based on overlapping samples
are presented in table 17.1.


Table 17.1. — Validation data for Picture Integration, CP104A, for pilots in primary
training, graduation-elimination criterion


M. 

SD, 

•r*u* 

»10* 

0*7 

$4.1$ 

$0  4$ 

IS  « 

0  17 

on 

•l*l 

.*7 

5430 

44.10 

15.lt 

.11 

•«* 

.14 

$4.01 

$0.4$ 

S4.M 

.70 

JB 

4  Altunin*  tn  unttilnclnl  ilvunt  itimlird  livutitn  •I  too. 

*  «•  (Uu  *48. 

*J«  fbim  UR  **C.  Ir-flu  !ri  lb*  Snt 

*  to  <Uii»i  41K,  ma,  4  ill.  »nd  44C  »ul  tw#  uatfiea. 

Evaluation. — This test has a satisfactory reliability and moderate
validity for pilots.

The pilot validity estimated from factorial equations (see ch. 28) is
0.22, which is close to the mean obtained validity of 0.25. A moderate
navigator validity probably would be found for this test also, because
of its saturations in the general-reasoning, spatial-relations, and
perceptual-speed factors.

Sixty-one  percent  of  the  total  variance  of  the  test  has  been  accounted 
for  by  identified  factors.  This  probably  leaves  a  considerable  amount  of 
undefined  nonerror  variance,  which  may  be  found  to  be  attributable  to 
some  new  perceptual  factor.  The  usefulness  of  the  test,  however,  in  fac¬ 
tor  research,  or  in  a  classification  and  selection  program  that  utilizes 
pure  tests,  is  limited  because  of  its  factorial  complexity.  The  following 
factors  account  for  a  substantial  part  of  the  total  test  variance :  General 
reasoning,  18  percent;  spatial  relations,  16  percent;  visual  memory,  8  per¬ 
cent;  perceptual-speed,  7  percent;  and  visualization,  7  percent.  The  inten¬ 
tion  of  the  test  constructors  to  measure  visualization,  therefore,  was  not 
accomplished;  the  test’s  maximum  loading  is  in  the  general-reasoning 
factor. 
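
The percentages quoted in this paragraph follow from squaring the factor loadings reported under factorial composition, on the usual assumption that the factors are uncorrelated. A short check, using only the loadings given in the text (the variable names are illustrative):

```python
# Loadings of Picture Integration, CP104A, as reported in the text.
loadings = {
    "general reasoning": 0.42,
    "spatial relations": 0.40,
    "visual memory": 0.28,
    "perceptual speed": 0.27,
    "visualization": 0.26,
}

# Percent of total test variance attributable to each factor.
percent_variance = {name: round(100 * a ** 2) for name, a in loadings.items()}

# Variance accounted for by the five listed factors together.
listed_total = sum(a ** 2 for a in loadings.values())
```

The five listed factors sum to somewhat less than the communality of 0.61; the remainder presumably comes from smaller loadings not listed here.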

Pattern Assembly, CP804A *

This test was originally developed for inclusion in an experimental
battery  of  mechanical-comprehension  tests.  It  is  a  variation  of  the  famil¬ 
iar  paper  form-board  test,  which  has  seen  much  use  in  industrial  psy¬ 
chology  as  a  component  of  mechanical-ability  test  batteries.  Its  correla¬ 
tions  with  other  mechanical-comprehension  tests  in  the  experimental 
mechanical  battery,  however,  indicated  that  it  did  not  have  much  in 
common  with  them.  The  test,  therefore,  was  assigned  a  perceptual  code 
number. Like the Picture Integration test, it seemed to require
visualization and the integration of the disordered parts of a whole.

FIGURE 17.2

SAMPLE ITEM OF PATTERN ASSEMBLY, CP804A

Description. — The nature of the test items is best explained by
reference to figure 17.2, which is one of the sample items used in the
directions to the test. The examinee is required to select that one of the five
patterns lettered A to E which shows exactly how the parts shown in
the upper left-hand corner would look when fitted together. The correct
answer is B. It is important to note that one of the four pieces must be
turned over in order to fit into pattern B.

(1) Internal characteristics. — The test includes 34 items, the first 2
of which are unscored sample items.

(2)  Administration. — Five  minutes  are  allotted  to  administration  of 
directions  and  15  minutes  for  answering  the  test  items. 

(3) Scoring. — The scoring formula is R - W/4.
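
With five alternatives per item, subtracting one-fourth of the wrong answers makes the expected score of an examinee who guesses blindly equal to zero. The sketch below illustrates that property for the 32 scored items; it is an illustration of the general correction for guessing, not a procedure taken from the report, and the names are hypothetical.

```python
from fractions import Fraction

def corrected_score(rights, wrongs, choices):
    """Rights minus wrongs divided by (number of choices - 1)."""
    return rights - Fraction(wrongs) / (choices - 1)

# Expected outcome of pure guessing on 32 scored five-choice items:
# one answer in five is right by chance, the rest are wrong.
n_items, choices = 32, 5
expected_rights = Fraction(n_items, choices)
expected_wrongs = n_items - expected_rights
guessing_score = corrected_score(expected_rights, expected_wrongs, choices)
```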

Statistical results. — The data given are for examinees tested in
November 1942 at Psychological Research Unit No. 3.

(1)  Distribution  statistics.— A  sample  of  485  unclassified  aviation 
students  yielded  a  mean  score  of  15.1  and  a  standard  deviation  of  5.1. 

(2)  Internal  consistency. — The  degree  of  homogeneity  of  the  items 
is  indicated  by  a  mean  phi  coefficient  of  0.26,  with  a  range  from  0.10  to 
0.54 and a standard deviation of 0.09. These data are based upon the
responses  of  the  upper  50  percent  and  the  lower  50  percent  of  a  group 
of  150  unclassified  aviation  students. 

(3)  Reliability  coefficient. — By  the  alternate-forms  method  (artifi¬ 
cially  separated  halves,  separately  timed),  an  estimated  reliability  coeffi¬ 
cient  of  0.59,  corrected  for  length,  was  obtained.  This  figure  is  based 
on  a  sample  of  202  unclassified  aviation  students. 

(4) Difficulty. — Based upon the responses of 150 unclassified aviation
students, the test yielded a mean proportion of correct responses of 0.64,
corrected for chance success, with a range from 0.12 to 0.94 and a
standard deviation of 0.26.

(5) Factorial composition. — This test was included in only one factor
analysis. Prominent loadings were found only in the length-estimation
(0.52) and perceptual-speed (0.31) factors. Slight loadings were found
also in the visualization (0.16) and general-reasoning (0.14) factors. The
communality is 0.42. For a fuller picture of the factorial composition of
this test, see Appendix B.

(6) Test validity. — Validity data are presented in table 17.2.


Table 17.2. — Validity data for Pattern Assembly, CP804A and CP804AX1, for
pilots in primary training, graduation-elimination criterion


Far* 

H 

S(«i 

u 

• 

SD, 

crwA  .... 

•».* 

R  -  U7«  . . 

0*2 

l«M 

IJ.OO 

s  e* 

0.21 

CI'N«A  .. 

H<|T 

K--W/4  .. 

.11 

l>  «* 

14  W 

J  20 

.22 

CFtMAX  1* 

151 

K  . 

.74 

22  :* 

:i  eS 

4  70 

.07 

CI*»0«AXM 

1 U 

1 

jK  -  «  W  ... 

.71 

m 

0.92 

"I 

.14 

*  In  (iui  4 it. 

*  la  "Uh  UK. 

*  A  *t»y  tlfkilf  i-Cttttn  f«a;  »N  Vlf*. 


A variation: CP804AX1. — This, the original form of the Pattern
Assembly test, is composed of 40 items, most of which are identical with
those of the revised form, CP804A.

Evaluation. — Of all tests studied in the program, this test has the
highest loading (0.52, accounting for 27 percent of the total test
variance) in the length-estimation factor. It, therefore, deserves serious
consideration. Its next highest loading (0.31, accounting for 10 percent of
the test’s variance) is in the perceptual-speed factor, and, probably, this
can be decreased by making the details of the test items more easily
discriminable as to form and detail. The very low loadings on the
visualization (0.16) and general-reasoning (0.14) factors are encouraging (to
those interested in pure tests). These loadings, indeed, may represent
sampling deviations from zero. If they do not, the visualization loading,
at least, probably can be decreased by not requiring that some of the
segments be turned over before they can be fitted into the key figure.
Since only 42 percent of the total test variance is accounted for by the
factors so far identified in this test, however, there is considerable
nonerror variance yet undefined. There is room for a new factor with a
loading in this test of approximately 0.40.
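
The figure of approximately 0.40 appears to follow from the statistics reported earlier for this test, although the text does not show the arithmetic. Assuming that nonerror variance equals the reliability coefficient (0.59) and that the identified factors account for the communality (0.42), the remainder is the variance available to an unidentified factor, and its square root is the largest loading such a factor could have. A sketch under those assumptions, with illustrative variable names:

```python
import math

reliability = 0.59   # alternate-forms estimate, corrected for length
communality = 0.42   # variance accounted for by identified factors

# Reliable variance not yet assigned to any identified factor.
unexplained_nonerror = reliability - communality

# Largest loading a single new factor could have in this test.
max_new_loading = math.sqrt(unexplained_nonerror)
```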

The pilot validity is only moderate, the weighted average validity for
a total of 839 cases being only 0.18. The validity predicted from factorial
equations (see ch. 28) is 0.16, which agrees well with the obtained
validity.

In  its  present  form,  the  test  is  slightly  too  easy,  and  its  reliability  is 
only  minimally  satisfactory,  although  it  could  be  a  useful  member  of  a 
battery  of  tests. 

Area  Visualization,  CP815A  * 

This  test  was  designed  as  a  measure  of  the  ability  of  manipulatory 
visualization.  It  is  somewhat  similar  to  Pattern  Assembly,  CP804A. 


FIGURE 17.3

SAMPLE ITEMS OF AREA VISUALIZATION, CP815A

1 Developed at Psychological Research Unit No. 1. Chief contributor: Tie. T. IUk«,

Description. — For each test item, the examinee is required to indicate
which one of three figures will be formed when two segments are
rotated about so that they fit together. Sample items are shown in figure
17.3.

(1) Internal characteristics. — The test is divided into two parts of 30
items each. One other item is used as an illustrative sample in the
directions. Unlike the Pattern Assembly test, segments need not be turned
over; all that is required is rotation.

(2)  Administration. — Each  part  is  allotted  7  minutes,  and  the  direc¬ 
tions  take  another  5  minutes. 

(3) Scoring. — The scoring formula is R - W/3.
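As an illustration of how a scoring formula of this kind operates, the
following minimal Python sketch computes rights minus a fraction of
wrongs. The function name, its arguments, and the example counts are
illustrative assumptions and are not taken from the report.

```python
def formula_score(rights: int, wrongs: int, divisor: float) -> float:
    """Return R - W/divisor; omitted items count neither for nor against."""
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    return rights - wrongs / divisor


# Hypothetical examinee on a 60-item test: 40 right, 12 wrong, 8 omitted.
score = formula_score(rights=40, wrongs=12, divisor=3)
print(score)  # 36.0
```

The same function covers the other formulas quoted in this chapter by
changing the divisor (for example, 4 for R - W/4, or 1 for R - W).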

Evaluation. — Since  there  are  no  statistical  data  available  for  this  test, 
little  can  be  said  in  evaluation  of  it.  The  test  is  relatively  long,  and  it 
should  have  satisfactory  reliability.  Inspection  of  the  test  items  suggests 
that  length-estimation  variance  is  reduced  sharply,  as  compared  with 
Pattern  Assembly,  CP804A. 

It  seems  doubtful  that  the  test  will  have  much  visualization  variance, 
judging  from  its  similarities  to  the  Pattern  Assembly  test. 

PATTERN  COMPLETION  TESTS 
Mutilated  Words,  CP512A 1 

This  is  one  of  a  battery  of  nine  tests  adapted  from  forms  devised  by 
L.  L.  Thurstone.  These  tests  were  intended  to  measure  visualization 
abilities. 

Description. — Each item consists of a mutilated word, i.e., a printed
word partially erased, and five complete words labeled A through E. It
is the task of the examinee to recognize the incomplete word and to
select from five alternatives the word that bears the closest relationship
to it. The completed form of the mutilated word is not presented among
the five alternatives, but the answer word is sufficiently similar in
meaning to be easily recognized.

Two  sample  test  items  are  shown  in  figure  17.4. 

A-ashes  B-weapon  C-bones  D-smoe  E-cement

A-reptile  B-pillow  C-radio  D-sextant  E-diamond

FIGURE 17.4

SAMPLE ITEMS OF MUTILATED WORDS, CP512A

1 Developed at Perceptual Research Unit, Headquarters, AAF Training Command. Chief
contributors: Capt. Richard H. Henneman and staff.
(1) Internal characteristics. — The test is made up of 5 sample and
25 scored items. They are arranged in approximate order of difficulty.

(2)  Administration. — The  over-all  testing  time  is  approximately  9 
minutes,  with  4  minutes  required  for  the  test  proper  and  4  to  5  minutes 
for  directions  and  sample  problems. 

(3)  Scoring. — The  score  on  this  test  is  simply  the  number  of  correct 
responses. 

Statistical  results.  (1)  Distribution  statistics. — Distribution  statistics 
obtained  on  this  test  are  given  in  table  17.3.  The  distribution  curves  are 
moderately  negatively  skewed. 


Table 17.3. — Distribution constants for Mutilated Words, CP512A

Group                                 N      M
Unclassified aviation students 1      460    17.8
Classified pilots 2                   640    17.8
Classified pilots 3                   185    18.2

1 (Footnote illegible in source.)
2 Tested in April 1943 at Psychological Research Unit No. 1. In classes 44A, 44B, and 44C.
3 Tested in August 1943 at Psychological Research Unit No. 3. In class 43K.


(2)  Test  validity. — Validation  results  based  on  several  samples  are 
given  in  table  17.4. 


Table 17.4. — Validity data for Mutilated Words, CP512A, based upon samples of
pilots in primary training, graduation-elimination criterion

N        Proportion   Mean,       Mean,       SD      r_bis   Corrected
         graduated    graduates   eliminees                   r_bis 1
640 2    0.74         17.92       17.64       3.11    0.05    0.08
185 3    .86          18.21       18.12       2.71    .00     ...

1 Assumed unrestricted stanine standard deviation not reported.
2 In classes 44A, 44B, and 44C. Tested in April 1943 at Psychological Research Unit No. 1.
3 In class 43K. Tested in August 1943 at Psychological Research Unit No. 3.


Evaluation. — The paucity of data for this test makes evaluation
difficult. Because of the small number of items, it probably is not very
reliable, and the distribution curves show that the items are relatively
easy. The test has almost no pilot validity, so we may be rather sure that
none of the factors known to be valid for pilots is substantially
represented. In Thurstone's analysis (2) of a 15-item form of the test (in
which the examinee responded directly by reading the mutilated word),
the principal loadings of the test were in a factor identified as "speed
and strength of closure," and in another factor called "speed of
perception." The latter, best defined in Thurstone's analysis by peripheral
span and dark-adaptation measures, probably is not the same as the
perceptual-speed factor defined in the present volume.


Object  Completion,  CP811A* 

This  test  was  designed  specifically  to  measure  the  ability  to  perceive 
the  form  of  an  object  when  only  a  portion  of  its  elements  can  be  seen. 


* Developed at Psychological Research Unit No. 3. Chief contributors: Tech/Sgt. Paul C.
Davis, Capt. Richard H. Henneman, Sgt. Frederick H. Mcue, and Tech/Sgt. Sanford J. Mock.
It was thought that perceptual integration, or the ability to recognize
total situations from partial impressions, as in the identification of terrain
from fleeting, incomplete glimpses through clouds or smoke, might be
an important function in air-crew success.

Description. — The  test  is  a  modification  of  the  Street  Gestalt  com¬ 
pletion  test  (1).  It  consists  of  a  series  of  drawings  of  military  objects 
from  which  many  of  the  parts  are  deleted.  It  is  the  examinee’s  task  to 
select  from  a  list  of  alternative  answers  the  name  of  the  object  which  is 
partially  portrayed  in  the  item. 

(1) Internal characteristics. — On each double-page spread, six
incomplete drawings are presented. With these 6 pictures is a list of 13
possible answers, including the correct ones, misleads, and others. Figure
17.5 shows some items of the test. The test is divided into 2 parts of 30
items each. In addition, there is one sample item used in the directions.


FIGURE 17.5

SAMPLE ITEMS OF OBJECT COMPLETION, CP811A

(Legible entries in the answer list: Officer's Cap, Telephone, Spark Plug.)

(2)  Administration. — The  sample  item  is  explained  and  answered 
before  the  test  is  begun.  Fifteen  minutes  are  allowed  for  each  part.  Ad¬ 
ministration  requires  approximately  5  minutes. 
(3) Scoring. — The scoring formula is R - W/4.

Statistical results. (1) Distribution statistics. — In table 17.5 are shown
distribution data for unclassified aviation students and for classified
pilots. The distribution curves are slightly positively skewed.


Table 17.5. — Distribution constants for Object Completion, CP811A

Group                               Part       Score   N       M      SD
Classified pilots 1                 I          R       563     11.6   6.1
Do. 2                               I          R       1,510   12.1   5.3
Do. 1                               II         R       563     12.1   6.1
Do. 2                               II         R       1,110   14.1   5.1
Unclassified aviation students 2    I and II   R       500     27.4   9.6
Classified pilots 2                 I and II   R       119     26.4   9.4
Do. 2                               I          W       1,110   10.3   5.7
Do. 2                               II         W       1,110   8.6    5.1
Do. 2                               I and II   W       119     18.1   10.4
Unclassified aviation students 2    I and II   W       500     17.1   10.0

1 In class 44J. Tested in April 1944 at Psychological Research Unit No. 3.
2 Tested at Psychological Research Unit No. 1 in May 1944.


(2)  Internal  consistency. — The  degree  of  homogeneity  of  the  items 
is  indicated  by  a  mean  internal-consistency  phi  of  0.37,  with  a  standard 
deviation  of  0.14  and  a  range  of  values  from  0.10  to  0.66.  These  statis¬ 
tics  are  based  upon  the  upper  27  percent  and  the  lower  27  percent  of  a 
group  of  570  unclassified  aviation  students,  tested  in  May  1944  at  Psy¬ 
chological  Research  Unit  No.  3. 
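The internal-consistency phi described above can be illustrated with a
short Python sketch. It assumes the ordinary fourfold-table phi computed
from the pass and fail counts of the upper and lower 27 percent groups;
the report does not say how its phi values were actually obtained (charts
or tables may have been used), and the counts below are hypothetical.

```python
import math


def phi_coefficient(upper_pass: int, upper_fail: int,
                    lower_pass: int, lower_fail: int) -> float:
    """Fourfold-table phi between group membership and passing the item."""
    a, b, c, d = upper_pass, upper_fail, lower_pass, lower_fail
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        return 0.0  # a margin is empty, so the item cannot discriminate
    return (a * d - b * c) / denom


# 27 percent of 570 examinees is about 154 per extreme group.
group_size = round(570 * 0.27)

# Hypothetical item: 120 of the upper group and 60 of the lower group pass.
phi = phi_coefficient(120, group_size - 120, 60, group_size - 60)
print(round(phi, 2))
```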

(3) Reliability coefficients. — Samples to which this test was
administered yielded the estimates of reliability given in table 17.6.


Table 17.6. — Estimated reliability coefficients for Object Completion, CP811A,
based upon samples of classified pilots 1

N              Variables                  Correlation      Estimated
                                          between parts    reliability
(illegible)    Part I v. Part II (R) 2    0.68             0.81
561            Part I v. Part II (R) 2    .71              (illegible)

1 In class 44J. Tested at Psychological Research Unit No. 3 in April and May 1944.
Overlapping samples.
2 (R) indicates a simple "number right" score.
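The step from a correlation between two parts to an estimated
reliability for the whole test can be sketched as follows. The sketch
assumes that the Spearman-Brown formula was the method of estimation,
which this section does not state explicitly; the function name and the
`factor` parameter are illustrative.

```python
def spearman_brown(r_parts: float, factor: float = 2.0) -> float:
    """Estimate the reliability of a test lengthened by `factor`.

    With factor=2 this steps a part-versus-part correlation up to an
    estimate for the full-length test.
    """
    return factor * r_parts / (1.0 + (factor - 1.0) * r_parts)


# Correlation of 0.68 between Part I and Part II, as listed in table 17.6.
print(round(spearman_brown(0.68), 2))
```

Under this assumption, a part-versus-part correlation of 0.68
corresponds to a full-test estimate of about 0.81.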


(4) Correlation between rights and wrongs. — Based upon a sample of
500 unclassified aviation students tested at Psychological Research Unit
No. 3 in May 1944, the correlation between total rights and total wrongs
was -0.37.

(5)  Difficulty. — Based  upon  the  responses  of  750  unclassified  avia¬ 
tion  students,  the  test  yielded  a  mean  proportion  of  correct  responses 
of  0.56,  with  a  standard  deviation  of  0.16  and  a  range  from  0.05  to 
0.81.  This  sample  was  tested  in  May  1944  at  Psychological  Research 
Unit  No.  3.  Correction  for  chance  success  was  not  attempted,  because 
the  test  is  a  matching  test  with  progressive  elimination  of  the  number  of 
alternative  responses. 
(6) Test validity. — Validation results based on several samples are
given in table 17.7.

Table 17.7. — Validity data for Object Completion, CP811A, based upon samples of
pilots in primary training, graduation-elimination criterion

Score          N       Proportion   Mean,       Mean,       SD      r_bis   Corrected
                       graduated    graduates   eliminees                   r_bis 1
Part I (R)     339     0.76         13.04       11.28       5.2     0.20    0.26
Part I (R)     1,310   .90          12.73       12.76       5.46    -.005   .07
Part I (W)     339     .76          9.24        10.19       5.7     -.10    -.11
Part I (W)     1,310   .90          10.14       10.13       5.73    .02     -.03
Part II (R)    339     .76          14.07       13.08       5.1     .11     .16
Part II (R)    1,310   .90          14.31       14.00       5.29    .03     .10
Part II (W)    339     .76          8.52        8.96        5.4     -.05    -.05
Part II (W)    1,310   .90          8.59        8.91        5.12    -.03    -.06
Total (R)      339     .76          27.11       24.36       9.3     .17     .23
Total (W)      339     .76          17.76       19.15       10.3    -.08    -.09

1 Assuming an unrestricted stanine standard deviation of 2.00.
2 Tested May 11 to May 15, 1944, at Psychological Research Unit No. 3.
3 Tested in April 1944 at Psychological Research Unit No. 3. In class 44J.


Evaluation. — This test does not have much promise as a pilot selection
instrument. Its reliability and difficulty level are satisfactory. In
Thurstone's analysis (2) of perceptual tests, another modification of the
Street Gestalt completion test has prominent loadings in two perceptual
factors: speed and strength of closure, and speed of perception.

PATTERN  ANALYSIS  TESTS 
Pattern  Analysis,  CP512A  1 

This  is  a  variant  of  the  Gottschaldt  concealed-figure  test  utilizing  only 
one  standard  figure  (a  capital  Greek  sigma)  embedded  in  various  com¬ 
plex  designs.  The  test  was  designed  to  measure  the  ability  to  form 
figure-ground  relationships,  and,  like  Mutilated  Words  discussed  above, 
is  one  of  the  battery  of  nine  tests  designed  for  the  analysis  of  visualizing 
abilities. 

Description. — The task of the examinee is to detect the outline of the
standard figure in a complex design. Figure 17.6 shows the standard
figure and five sample items. The standard figure can be detected in
alternatives A, C, and D. (In the actual test, of course, the standard
figure is not shown with every set of items. It is shown only once, in
the directions to the test.)

(1) Internal characteristics. — Test instructions include a showing of
the standard figure, 18 exemplary items, and 15 practice items. The
test proper consists of 1,035 items, arranged in blocks of five. If the
design contains the standard figure, the examinee is required to fill in
the appropriate answer-space; if it does not, the examinee leaves the
answer-space blank. The standard figure actually appears in 104 of the
items.

(2) Administration. — Administration requires approximately 5 minutes,
with an over-all testing time of 17 minutes. In the original
administration of these tests, however, 15 minutes were allowed for the
first 735 items.

1 Developed at Perceptual Research Unit, Headquarters, AAF Training Command. Chief
contributors: Capt. Richard H. Henneman and staff.

FIGURE 17.6

STANDARD FIGURE AND SAMPLE ITEMS (A, B, C, D, E) OF PATTERN
ANALYSIS, CP512A

(3) Scoring. — The scoring formula is R - W.

Statistical results. (1) Distribution statistics. — In table 17.8 are
presented distribution constants for two samples of classified pilots.

Table 17.8. — Distribution constants for Pattern Analysis, CP512A, based upon
samples of pilots

N        M       SD
392 1    59.1    11.7
640 2    64.4    14.6

1 Tested at Psychological Research Unit No. (illegible). Class not reported.
2 Tested at Psychological Research Unit No. 1 in April 1943 in classes 44A, 44B, and 44C.


(2) Reliability coefficient. — By the alternate-forms method, an
estimated reliability coefficient of 0.91, corrected for length, was obtained.
This figure is based on a sample of 438 unclassified aviation students and
on the administration of separately timed halves of the test.

(3) Factorial composition. — The most significant loadings are in the
visual-memory (0.35), perceptual-speed (0.26), numerical (0.26), and
visualization (0.18) factors. The communality is 0.35. For these factor
analyses, only the first 735 items of the test were administered. (The
estimated reliability is 0.87, based on 460 unclassified aviation students
and on administration of separately timed halves.) For a fuller picture
of the factorial composition of this test, see Appendix B.

(4) Test validity. — A sample of 640 pilots (classes 44A, B and C;
see table 17.8) yielded a biserial correlation of 0.16, corrected for
restriction of range, between performance in this test and the
graduation-elimination criterion from primary training. The mean score for
graduates was 65.30, for eliminees 62.00; and the standard deviation for both
combined was 16.61. Of this sample, 74 percent were graduated, and
the standard deviation assumed for the unrestricted pilot stanine
distribution was 2.00.
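The relation between the group means and a biserial coefficient can be
illustrated with the textbook biserial formula. The Python sketch below
applies it to the figures quoted above. It yields the uncorrected
coefficient only; the correction for restriction of range, which leads
to the reported 0.16, is not reproduced here because it depends on
quantities not given in this passage. The function and argument names
are illustrative.

```python
from statistics import NormalDist


def biserial_r(mean_pass: float, mean_fail: float,
               sd_total: float, p_pass: float) -> float:
    """Textbook biserial correlation for a dichotomized criterion."""
    q_fail = 1.0 - p_pass
    z = NormalDist().inv_cdf(p_pass)   # point cutting the normal curve at p
    y = NormalDist().pdf(z)            # ordinate of the curve at that point
    return (mean_pass - mean_fail) / sd_total * (p_pass * q_fail / y)


# Graduates 65.30, eliminees 62.00, combined SD 16.61, 74 percent graduated.
r_uncorrected = biserial_r(65.30, 62.00, 16.61, 0.74)
print(round(r_uncorrected, 2))
```

On these figures the uncorrected coefficient comes to roughly 0.12, so
the range correction accounts for the difference from the reported 0.16.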

Evaluation. — This test is highly reliable, but its validity for pilots is
low. Only 35 percent of the total test variance is accounted for by the
factors so far identified in the test, which leaves a very large amount
of undefined nonerror variance.

The identified factors account for the following percentages of the
total variance of the test: Visual memory, 12 percent; perceptual speed,
7 percent; numerical, 7 percent; and visualization, 3 percent. No other
factor accounts for more than 2 percent of the test's variance.
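These percentages follow from squaring the factor loadings reported
under factorial composition. The Python sketch below reproduces the
arithmetic. It also estimates the undefined nonerror variance as
reliability minus communality, which is the usual factor-analytic
reading of that phrase and is an assumption here, as is the choice of
the 0.87 reliability figure quoted for the form used in the factor
analyses.

```python
def percent_of_variance(loading: float) -> int:
    """Share of total test variance for one factor, as a whole percent."""
    return round(100 * loading ** 2)


loadings = {
    "visual memory": 0.35,
    "perceptual speed": 0.26,
    "numerical": 0.26,
    "visualization": 0.18,
}
shares = {name: percent_of_variance(value) for name, value in loadings.items()}
print(shares)

communality = 0.35   # variance accounted for by all identified factors
reliability = 0.87   # estimate quoted for the form used in the analyses
undefined_nonerror = reliability - communality
print(round(undefined_nonerror, 2))
```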

Based on factorial equations (see ch. 28), the predicted pilot validity
is 0.15, which indicates that all the test's pilot validity is accounted for.

Camouflaged  Outlines,  CP821A  * 

This is the final variation of the Gottschaldt figures test developed in
the AAF program. An earlier form had been developed before the war
by Col. J. P. Guilford. It was designed for inclusion in a special
experimental study, which was interrupted by the end of the war. While no
statistical data are available, its description is of some interest, since
experience with previous experimental forms of this type of test
determined its specific characteristics.

An original nonmachine-scorable form of the Gottschaldt figures test
was administered as Hidden Figures, CP802A. For a group of 652
pilots, this test correlated 0.36 with graduation-elimination through
advanced training. Following this lead, extensive investigations were made
into the usefulness of this type of test by those responsible for the
development of the AAF Qualifying Examination.*

The results indicated (a) that very easy items should be avoided, (b)
that the test had low to moderate pilot validity (ranging from 0.17 to
0.25 in different samples), (c) that satisfactory results were attained
with either closely timed or essentially untimed administration, and (d)
that an extensive practice period was necessary.

In developing Camouflaged Outlines, CP821A, all previously
constructed test items were carefully scrutinized. Many were used in their
original form, others were modified, and some were constructed especially
for the purposes of this test. An attempt was made to have all items
close to the 50 percent difficulty level, using a priori judgments, available
item statistics, and pretesting with members of the test-construction staff.

Description. — As in other tests of this type, the examinee is required
to detect a simple figure within a complex design. Two sets of four
standard figures each are provided. Figure 17.7 shows in the top panel
one set of standard figures, and, in the lower panel, four test items. The
examinee is required to indicate which of the standard figures is
included in each complex design.

* Developed at Psychological Research Unit No. 2. Chief contributors: Capt. John I. Lacey
and Jeanette (surname illegible).

* These results are not presented in this report. For a detailed discussion, see Report No. 6
in this series, The AAF Qualifying Examination.

FIGURE 17.7

STANDARD FIGURES AND TEST ITEMS OF CAMOUFLAGED
OUTLINES, CP821A

(1) Internal characteristics. — Extensive directions, including
practice items, are utilized. The examinee is shown one item with a heavy
outline showing a standard figure. He then attempts four practice items.
After these four items are answered, the correct answers are given to
him, again utilizing heavy outlines of the standard figures. The test
proper is divided into 2 parts of 16 items each. In the first part, the first
eight items involve the first set of standard figures, the second eight
items, the second set of standard figures. In part II the order of
presentation of the standard figures is reversed, thus providing an ABBA
order.

(2)  Administration. — Five  and  one-half  minutes  are  allowed  for  each 
part,  with  approximately  an  additional  5  minutes  required  for  adminis¬ 
tration  of  the  directions. 

(3) Scoring. — The scoring formula is R - W/3.

Penetration of Camouflage, CP812A 10

This  test  was  adapted  from  Thurstone’s  Hidden  Pictures  test. 

Description. — The test consists of six page-size drawings depicting
military activities. Concealed faces are worked into the context of each
scene. The faces may be front view, profile, right side up, or upside
down. If necessary to detect the faces, the examinees may turn the
booklet in any direction. The left and right borders of the pictures are
blocked off into five sections, which correspond to five numbered item
spaces on the IBM answer sheet. The upper and lower borders are
blocked off into five equal sections which correspond to the five lettered
alternatives for each item-number. The task of the examinee is to detect
the camouflaged faces in each picture and to mark their location in
terms of the item-alternative coordinates. Figure 17.8 shows the sample
scene used in the administrative directions. In this sample, some of the
concealed figures are encircled.
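The coordinate scheme described above can be pictured with a small
Python sketch that converts the position of a face within a picture into
an item number and a lettered alternative. Everything in it is a
hypothetical illustration: the normalized coordinates, the top-left
origin, and the assumption that item numbers run consecutively from one
picture to the next are not stated in the report.

```python
LETTERS = "ABCDE"
SECTIONS = 5  # sections along each border of a picture


def face_to_answer(x: float, y: float, picture: int = 0) -> tuple[int, str]:
    """Map a face position to (item number, alternative letter).

    x runs left to right and y top to bottom, both from 0.0 to 1.0.
    The vertical section selects the item number; the horizontal
    section selects the lettered alternative.
    """
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        raise ValueError("coordinates must lie within the picture")
    column = min(int(x * SECTIONS), SECTIONS - 1)
    row = min(int(y * SECTIONS), SECTIONS - 1)
    return picture * SECTIONS + row + 1, LETTERS[column]


# A face in the middle of the top strip of the first picture.
print(face_to_answer(0.5, 0.1))  # (1, 'C')
```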


FIGURE 17.8

SAMPLE DRAWING OF PENETRATION OF
CAMOUFLAGE, CP812A

(1) Administration. — The examinees are told that: "This is a test
of your ability to detect camouflaged figures * * * do not indicate on
your answer sheet the obvious faces of people in the pictures. * * *"
The faces in the first two item sections of the sample scene are
encircled, and the examinees are told to detect those in the last three
sections. After this is done, the locations of the faces are pointed out.
Two and one-half minutes are allowed for each picture. At the end of that
time the examinees are told to start on the next. The approximate total
testing time is 25 minutes. The first three pictures are considered to
constitute part I, and the last three, part II, with 36 and 35 concealed
faces respectively.

(2) Scoring. — Rights and wrongs are scored separately for each part.

Statistical  results.  (1)  Distribution  statistics. — Distribution  statistics 
obtained  for  this  test  are  given  in  table  17.9. 

Table 17.9. — Distribution constants for Penetration of Camouflage, CP812A,
based upon a sample of 773 pilots 1

Score     M       SD
Rights    41.4    (illegible)
Wrongs    2.2     (illegible)

1 Tested June 12 to Aug. 11, 1944, at Psychological
Research Unit No. 4.


(2)  Internal  consistency. — The  degree  of  homogeneity  of  the  items 
of  the  test  is  indicated  by  a  mean  internal-consistency  phi  of  0.32  and 
a  range  of  values  from  0.00  to  0.58.  These  statistics  are  based  upon  the 
responses  of  the  highest  27  percent  and  the  lowest  27  percent  in  total 
score  of  a  group  of  740  unclassified  aviation  students,  tested  in  June 
1944  at  Psychological  Research  Unit  No.  3. 

(3)  Reliability  coefficient. — A  sample  to  which  this  test  was  adminis¬ 
tered  yielded  the  estimates  of  reliability  given  in  table  17.10. 


Table 17.10. — Estimated reliability coefficients for Penetration of Camouflage,
CP812A, based upon a sample of 773 pilots 1 in primary training

Variables                          Correlation      Estimated
                                   between parts    reliability
Part I rights v. Part II rights    0.44             0.61
Part I wrongs v. Part II wrongs    .55              .71

1 Tested June 12 to Aug. 11, 1944, at Psychological Research Unit No. 2.


(4)  Correlations  of  rights  and  wrongs — The  intercorrclations  of 
rights  and  wrongs  are  shown  in  table  17.11. 

Table 17.11. — Intercorrelations of rights and wrongs of Penetration of Camouflage,
CP812A, for a sample of 773 pilots in primary training 1

                       1              2              3        4
1. Part I rights       ...            (illegible)    -0.02    0.05
2. Part II rights      (illegible)    ...            .04      .11
3. Part I wrongs       -.02           .09            ...      .55
4. Part II wrongs      .05            .11            .51      ...

1 See footnote 1, table 17.10.
(5) Difficulty. — Based upon item analysis of the responses of 750
unclassified aviation students, tested in June 1944 at Psychological
Research Unit No. 3, the test yielded a mean proportion of correct
responses of 0.68, with a range from 0.15 to 0.99 and a standard deviation
of 0.21.

(6) Test validity. — Validation results based on a sample of 773
pilots in elementary training are given in table 17.12.

Table 17.12. — Validity data for Penetration of Camouflage, CP812A, for pilots
in primary training, graduation-elimination criterion (N = 773) 1

Part        Score     Mean,       Mean,       SD      r_bis          Corrected
                      graduates   eliminees                          r_bis 2
I           Rights    22.96       23.12       5.13    (illegible)    0.02
II          Rights    20.48       20.21       4.17    (illegible)    .10
I           Wrongs    .74         .99         1.19    (illegible)    -.15
II          Wrongs    1.37        1.43        2.13    (illegible)    -.04
I and II    Rights    43.44       43.33       8.53    (illegible)    .06
I and II    Wrongs    2.11        2.42        2.95    (illegible)    (illegible)

1 See footnote 1, table 17.10.
2 Assuming an unrestricted stanine standard deviation of 2.00.


Evaluation. — Both  rights  and  wrongs  for  this  test  have  satisfactory 
reliability,  and  they  apparently  measure  different  functions.  Neither 
score  has  much  validity  for  the  pilot  criterion. 

Camouflaged Figures, CP810A 11

This test is one of the series designed to measure the ability to
distinguish a pattern from a confused background.

Description. — The test items consist of capital letters outlined in dots
and surrounded by other dots which disrupt the pattern of the letters.
The task of the examinee is to distinguish the letter pattern from its
confused background. The answers are recorded on a special IBM
answer sheet, with spaces for all letters of the alphabet except Q. There
is only one letter in each design, and it is always right side up. Three
sample items are presented in figure 17.9.

A B C D E F G H I J K L M N O P R S T U V W X Y Z

FIGURE 17.9

SAMPLE ITEMS OF CAMOUFLAGED FIGURES, CP810A

11 Developed at Psychological Research Unit No. (illegible). Chief contributors: Capt. Stuart W.
Cook and others (names illegible).
(1) Internal characteristics. — The test is made up of 2 parts of 30
items each. There are three practice problems at the beginning of the
test.

(2) Administration. — Thirteen minutes are allowed for part I and
11 minutes for part II. Administration of directions requires
approximately 5 minutes.

(3)  Scoring. — Rights  and  wrongs  are  scored  separately. 

Statistical results. (1) Distribution statistics. — Typical examples of
distribution statistics obtained on this test are given in table 17.13.

Table 17.13. — Distribution constants for Camouflaged Figures, CP810A, based
upon a sample of pilots in primary training 1

Score     N        M       SD
Rights    1,310    32.5    9.2
Wrongs    1,330    12.7    7.6

1 Tested in March 1944 at Psychological Research Unit No. 1 in class 44I.


(2) Reliability coefficients. — Samples to which this test was
administered yielded the estimates of reliability given in table 17.14.

Table 17.14. — Estimated reliability coefficients for Camouflaged Figures,
CP810A, based upon samples of pilots

Variables                          N        r
Part I rights v. Part II rights    760 1    0.69
Part I wrongs v. Part II wrongs    775 1    .40
Part I rights v. Part II rights    169 2    .70
Part I wrongs v. Part II wrongs    169 2    .49

1 Tested in March 1944 at Psychological Research Unit No. 3. In class 44I.
2 Tested April 10 and 11, 1944, at Psychological Research Unit No. 3.


(3) Difficulty. — Based upon the responses of 733 pilots, the test
yielded a mean proportion of correct responses of 0.72, with a range
from (illegible) to 0.99 and a standard deviation of 0.20. These statistics
are not corrected for chance success, since there is only 1 chance of 25 of
securing the correct answer by guessing.
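The reasoning that a correction for chance matters little with 25
alternatives can be checked with the conventional correction formula.
The Python sketch below assumes that formula, (p - 1/k) / (1 - 1/k);
the report does not say which correction it would otherwise have
applied, and the comparison with a five-choice item is included only
for contrast.

```python
def chance_corrected(p_correct: float, n_alternatives: int) -> float:
    """Proportion correct after removing the share expected from guessing."""
    chance = 1.0 / n_alternatives
    return (p_correct - chance) / (1.0 - chance)


# Mean proportion correct of 0.72 with 25 possible answers (A to Z, no Q).
with_25 = chance_corrected(0.72, 25)

# The same raw proportion on a five-choice item, for contrast.
with_5 = chance_corrected(0.72, 5)

print(round(with_25, 2), round(with_5, 2))
```

With 25 alternatives the corrected value stays within about 0.01 of the
raw 0.72, whereas a five-choice item with the same raw figure would drop
to about 0.65.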

(4) Test validity. — Validation results are given in table 17.15.


Table 17.15. — Validity data for Camouflaged Figures, CP810A, based upon
samples of pilots in primary training, graduation-elimination criterion

N          Part        Scoring    Proportion   Mean,          Mean,       SD             r_bis          Corrected
                       formula    graduated    graduates      eliminees                                 r_bis 1
1,330 2    I and II    Rights     0.96         32.72          31.36       9.15           (illegible)    0.17
1,330 2    I and II    Wrongs     .96          12.70          12.64       (illegible)    .01            -.05
169 3      I           Rights     .70          (illegible)    13.44       (illegible)    .16            .21
169 3      I           Wrongs     .70          (illegible)    6.76        (illegible)    -.11           -.11
169 3      II          Rights     .70          15.75          14.56       5.2            .14            .23
169 3      II          Wrongs     .70          3.92           3.92        (illegible)    .00            -.01
169 3      I and II    Rights     .70          31.39          29.00       (illegible)    .11            .26
169 3      I and II    Wrongs     .70          10.03          10.61       (illegible)    (illegible)    -.09

1 Assuming an unrestricted stanine standard deviation of 2.00.
2 In class 44I. Tested in March 1944 at Psychological Research Unit No. 3.
3 Tested April 10 and 11, 1944, at Psychological Research Unit No. 3.


(5) Item validity. — Validation of items revealed a mean phi of 0.01,
with a range from -0.11 to +0.24 and a standard deviation of 0.07,
based upon the responses of 600 graduates and 133 eliminees from
primary pilot training.

Evaluation. — This test has satisfactory reliability and moderate pilot
validity. The average difficulty level of the items is rather low.

Ability  to  Listen  in  Noise,  CP704BX2 

This test was designed to measure the ability to hear instructions
through noise. Inability to hear instructions and comments through the
intercommunication systems of training airplanes was frequently
reported as a difficulty and potential cause of elimination in early phases
of training. The first form of this test was developed by the Harvard
Psycho-Acoustic Laboratories,12 and the phonograph records on which
this test was recorded were provided by that organization.

Description. — This test, involving the ability to hear spoken words
above a noise screen similar to that made by a twin-engine bomber, is
recorded on phonograph records, and it involves 20 sections of 10 words
each. There are also two practice sections of 10 words each at the
beginning of the test. The task of the examinee is to select the word spoken
from five alternate choices, and to enter it on his answer sheet. Two
sample items are as follows:

       A          B          C          D          E

41.    hurt       church     perch      first      none of these.

42.    platter    clatter    clutter    flatter    none of these.

For 41, the spoken word was "hurt"; for 42, the spoken word was
"flatter." The alternative words are the most popular incorrect
responses that were made by a group of 100 unclassified aviation cadets
who wrote down what they heard.

Part  I  consists  of  20  practice  items  and  100  scored  items;  part  II 
consists  of  100  scored  items. 

(1) Administration. — The examinees are told that: "This is an auditory
test of your ability to hear and identify words above the roar of
an airplane engine. * * * As each word is spoken, you will look over the
five possible answers * * * and select the word you heard spoken. If
the spoken word does not appear, your answer will be E or none of
these." The total testing time is approximately 35 minutes.

(2) Scoring. — The scoring formula is R - W/5.
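The scoring rule can be written out as a short calculation. The Python sketch below is illustrative only: the function name `formula_score`, the treatment of omitted items as neither right nor wrong, and the example counts are assumptions and do not come from the scoring directions. The divisor follows the text as printed (5), although the customary correction for chance with five alternatives would divide by 4.

```python
def formula_score(rights: int, wrongs: int, divisor: float = 5.0) -> float:
    """Return R - W/divisor, the formula score described in the text.

    Omitted items are assumed to count as neither right nor wrong.
    """
    if rights < 0 or wrongs < 0:
        raise ValueError("counts of rights and wrongs must be non-negative")
    return rights - wrongs / divisor


# Hypothetical examinee on the 100 scored items of part I:
# 60 right, 35 wrong, 5 omitted.
print(formula_score(60, 35))  # 53.0
```

Passing `divisor=4.0` would give the conventional chance-corrected score for comparison.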

Statistical results. (1) Distribution statistics. — Table 17.16 presents
distribution data for this test.

¹² First form developed by Drs. S. S. Stevens and L. L. Beranek. The
records and accessory test materials were kindly made available to the
Army Air Corps by Dr. Stevens for experimental use.

Table 17.16. — Distribution data for Ability to Listen in Noise, CP704BX2,
based upon samples of classified pilots

    N         Part         Score      M          SD

   ¹610       I            R-W/5      45.3       9.5
   ²500       I            R          54.[?]     [illegible]
   ²500       II           R          51.5       10.5
   ²500       I and II     R          105.9      16.4
   ²500       I            W          44.2       8.0
   ²500       II           W          47.1       9.6
   ²500       I and II     W          91.3       15.3
   ³704       I and II     R-W/5      94.6       17.1
   ³1,122     I            R-W/5      44.2       9.9

¹ In class 44H, tested at Psychological Research Unit No. 3 in February 1944.
² Tested at Psychological Research Unit No. 3 in June 1944.
³ In classes 44H and 44I, tested at Psychological Research Unit No. 3,
February and June 1944.

(2) Reliability coefficients. — Samples to which this test was
administered yielded the estimates of reliability given in table 17.17.
The low correlation between the halves of part II is due to the inferior
quality of the first side of the second record. In the sample of 500,
this defect was remedied.


Table 17.17. — Alternate-forms reliability coefficients for Ability to
Listen in Noise, CP704BX2, based upon samples of classified pilots

    N         Variables                             r11       rtt

   ¹403       Part I v. Part IIA²                   0.28      0.44
   ¹403       Part I v. Part IIB                     .43       .60
    1,564     Part I v. Part II                      .51       .68
   ¹403       Part IIA² v. Part IIB                  .28       .44
   ³500       Part I v. Part II (rights only)        .46       .63
   ³500       Part I v. Part II (wrongs only)        .51       .67

¹ Class 44H, tested at Psychological Research Unit No. 3, February 1944.
² Side A of part II was technically inferior.
³ Class 44I, tested at Psychological Research Unit No. 3, June 1944.
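The second coefficient column of table 17.17 is not explained in the text. Its values agree, within rounding, with a Spearman-Brown step-up of the first column to double length, so the Python sketch below makes that assumption explicit. The function name and the reading of the two columns as "correlation between parts" and "estimated reliability of the combined parts" are assumptions, not statements from the source.

```python
def spearman_brown(r: float, length_factor: float = 2.0) -> float:
    """Estimated reliability when a test is lengthened by `length_factor`."""
    return length_factor * r / (1.0 + (length_factor - 1.0) * r)


# (first column, second column) pairs as read from table 17.17.
pairs = [
    (0.28, 0.44),
    (0.43, 0.60),
    (0.51, 0.68),
    (0.28, 0.44),
    (0.46, 0.63),
    (0.51, 0.67),
]

for r_parts, printed in pairs:
    estimate = spearman_brown(r_parts)
    print(f"r = {r_parts:.2f}  stepped up = {estimate:.3f}  printed = {printed:.2f}")
```

Each stepped-up value falls within 0.01 of the printed figure, which is what one would expect if the first column had been rounded before printing.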


(3) Correlation between rights and wrongs. — For a sample of 500
pilots (class 44I, tested in June 1944 at Psychological Research Unit
No. 3), the correlation between rights and wrongs was -0.88.

(4)  Difficulty. — Based  upon  analysis  of  the  responses  of  750  pilots, 
the  test  yielded  a  mean  proportion  of  correct  responses  of  0.57,  corrected 
for  chance  success,  with  a  range  from  0.00  to  0.95  and  a  standard  devi¬ 
ation  of  0.22. 

(5)  Test  validity. — Validation  results  are  presented  in  table  17.18. 


Table 17.18. — Validity data for Ability to Listen in Noise, CP704BX2, based
upon samples of pilots in primary training, graduation-elimination criterion

  Part          N_t       p_g     M_g      M_e      SD_t     r_bis    r_bis (corrected)¹

  I²           ³610       0.82    45.5     44.2     9.5      0.07     0.10
  I and II⁴    ⁵704        .84    95.10    92.33    17.80     .09      .12
  I²           ⁵1,122      .87    44.28    43.58    9.86      .04      .07

¹ Assuming an unrestricted stanine standard deviation of 2.00.
² Part II omitted because a large proportion of examinees was subjected to
technically inferior record.
³ In class 44H, tested at Psychological Research Unit No. 3, February 1944.
⁴ Both parts properly administered.
⁵ In classes 44H and 44I, tested at Psychological Research Unit No. 3,
February and June 1944.


(6) Item validity. — Validation of items revealed a mean phi of 0.01,
based upon the responses of 750 graduates and 68 eliminees from primary
pilot training in class 44[?]. The standard deviation of phi values
was 0.08, and the range was from -0.18 to 0.23.

(7) Study of the effects of seating. — The effect of the acoustics in the
testing room was considered of potential importance as a factor
influencing scores on the test. The test was administered in two different
buildings. A sample of 308 unclassified aviation cadets showed a mean score
of 42.8 and a standard deviation of 9.7 for part I in building A. A
sample of 159 unclassified aviation cadets showed a mean score of 46.4
and a standard deviation of 9.2 for part I in building B. Although the
difference between the means of the two buildings is significant at the
1 percent level, consideration of the wide range of scores indicated that
there would be little effect upon the validity of the test.
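The significance claim for the two buildings can be checked from the reported summary figures. The text does not say which test was used; the Python sketch below assumes a large-sample z test for the difference between two independent means, and the function name is hypothetical.

```python
import math


def z_for_mean_difference(m1: float, sd1: float, n1: int,
                          m2: float, sd2: float, n2: int) -> float:
    """Large-sample z for the difference between two independent means."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (m2 - m1) / standard_error


# Building A: N = 308, M = 42.8, SD = 9.7; building B: N = 159, M = 46.4, SD = 9.2.
z = z_for_mean_difference(42.8, 9.7, 308, 46.4, 9.2, 159)
print(f"z = {z:.2f}")

# 2.576 is the two-tailed critical value of the normal curve at the 1 percent level.
print("significant at the 1 percent level:", abs(z) > 2.576)
```

Under these assumptions the ratio is a little under 4, comfortably beyond the 1 percent critical value, which agrees with the statement in the text.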

It was also found that building A had less standard acoustical
conditions than B. For statistical analysis, the room in building A was
divided into quadrants. It was found that the left rear quadrant showed
a mean of 39.5 and a standard deviation of 8.4, which is significantly
lower (at the 1 percent level) than the average of the other three (mean
of 43.5, and standard deviation of 9.8). No differences among the
quadrants were found for building B. Building A had only three speakers,
as opposed to building B with four.

Variations. — There are two preliminary forms of this test, CP704A
and CP704BX1. All the forms use the same records, differing only in
answer booklets and directions. As was mentioned above, the A form
was developed by the Harvard Psycho-Acoustic Laboratories. The
directions and booklet for this test require the examinee to write the
word that he heard. This test was administered to 100 unclassified
aviation cadets, and the most frequently appearing incorrect answers were
selected for use in the construction of the first multiple-choice form,
CP704BX1. The use of this form necessitated extensive changes in the
administrative directions. With the exception of minor changes in the
directions and misleads, form BX2 is identical with BX1.

Evaluation. — Performance on this test is apparently easily influenced
by environmental conditions. Both the acoustics of the test room and the
quality of the recording must be standardized. Under conditions of
large-scale administration, the test has moderate reliability. Because of
its negligible pilot validity, however, further development of the test
was not undertaken.


ILLUSIONS  TESTS 

The  development  of  these  tests  was  prompted  by  two  reasons,  one 
general,  and  one  specific. 

The specific reason was that several valid tests of size and distance
estimation seemed to be contaminated with the presence of illusory
effects, e.g., the Estimation of Length test (see ch. 18). It was felt
desirable, therefore, to construct separate tests that would attempt to
measure resistance (or, conversely, susceptibility) to illusions, in a
deliberate and systematic manner.

In general, it was thought desirable to undertake an extensive
investigation of objectivity of perception, i.e., correspondence of the
perceived dimensions of objects with physically measured dimensions. The
project was begun quite late in the program, however, and only a few data
are available.

Objectivity of Perception, CP806CX1 and CP806CX2 ¹³

These two tests both utilize familiar geometric illusions, such as the
Müller-Lyer, Poggendorff, Ponzo, Sander parallelogram, vertical-horizontal,
filled v. unfilled space, equal squares, Titchener's circles, and
unnamed variants of these.

(1) Internal characteristics. — Form CP806CX1 consists of 80 items.
Five types of illusions are represented by 10 items each, and 2 other
illusions by 15 items each. Form CP806CX2 contains 70 items, with each
of 7 illusions represented by 10 items. The 14 illusions in the 2 forms
are all different, but 1 form may contain a major variant of the other.
Each illusion is presented as a separate section, including all the items
for that illusion. In figure 17.10 are shown items representative of
several of the illusions. For illustrative purposes, the items shown are
those in which the indicated dimensions are equal.

Each set of items constitutes a series, established as satisfactory by
pretesting. For example, for the illusion illustrated in panel I
of figure 17.10, there are two standard lengths, 1 1/2 and 2 inches. For
variables there are lines of the following lengths: 1 1/2, 1 5/8, 1 3/4,
1 7/8, and 2 inches for the first standard; and 2, 2 2/16, 2 4/16,
2 6/16, and 2 8/16 for the second standard. The order of presentation of
the items is randomized with respect to length of variables and position
(right or left) on the page of the variables.

(2) Administration. — The directions to the test are designed to
inculcate a set to resist the illusions, and, at the same time, to work
very rapidly. Pertinent parts of the directions follow:

This is a test of your ability to detect rapidly, merely by inspection, the true sizes
of camouflaged figures. * * *

Some of these figures have been drawn to look different than they really are. That
is, a line in one of these drawings may look longer or shorter than a ruler would
show it to be * * *

Your task is to make the most accurate judgments you can, by attempting to
ignore the extra lines and angles which camouflage the true size of the various parts
of the figures * * *

Work rapidly. The use of artificial aids is strictly to be avoided. As a matter of
fact, if you stop to measure length of lines or to turn the figures around, for
example, you will not be able to finish the test * * *

To reinforce the set for speed, the test administrator paces the
examinees, announcing at the end of each minute on what item the examinee
should be working. The examinees are required to answer eight items
each minute. This pacing was established after experimental
administration of the tests to 500 unclassified aviation students at
Psychological Research Unit No. 3. It permits approximately 80 percent to
finish all items and 100 percent to finish all but two items in a section.

¹³ Developed at Psychological Research Unit No. 3. Chief contributors:
Capt. John I. Lacey, Lt. Eli A. Lipman, and Sgt. Albert H. Hastorf III.


(3) Scoring. — No scoring formula is yet recommended for these
tests. For experimental purposes, responses were categorized as
indicating maximum resistance to illusion (i.e., calling equal lines equal
despite illusory effects), maximum susceptibility to illusion (i.e.,
answering in the direction of the illusion), and intermediate judgments
(e.g., two lines being called equal, where maximum resistance to the
illusion would elicit a response of longer, and maximum susceptibility a
response of shorter).
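The three response categories can be sketched as a simple classification rule. The Python below is a hypothetical illustration, not the scoring procedure actually used: it assumes each item is answered "shorter", "equal", or "longer", that the physically correct answer and the answer favored by the illusion are both known for each item, and that any response matching neither is counted as intermediate.

```python
from collections import Counter

RELATIONS = ("shorter", "equal", "longer")


def categorize(response: str, true_relation: str, illusion_relation: str) -> str:
    """Classify one response as "R" (resistance), "W" (susceptibility), or "I"."""
    for value in (response, true_relation, illusion_relation):
        if value not in RELATIONS:
            raise ValueError(f"unknown relation: {value!r}")
    if response == true_relation:
        return "R"  # agrees with the physical measurement
    if response == illusion_relation:
        return "W"  # agrees with the direction of the illusion
    return "I"      # assumed: everything else is an intermediate judgment


# Hypothetical items: (response, true relation, relation suggested by the illusion).
items = [
    ("equal", "equal", "longer"),    # equal lines called equal
    ("longer", "equal", "longer"),   # answered in the direction of the illusion
    ("equal", "longer", "shorter"),  # the intermediate case described in the text
]

counts = Counter(categorize(*item) for item in items)
print(counts["R"], counts["W"], counts["I"])  # 1 1 1
```

Counting the three labels over all items of a form would give "R", "W", and "I" scores of the kind reported in the tables that follow.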

Statistical results. (1) Distribution statistics. — Available distribution
data are shown in table 17.19.

Table 17.19. — Distribution data and Kuder-Richardson reliability for
Objectivity of Perception, CP806CX1 and CP806CX2, based upon samples of
unclassified aviation students

    N        Form         Score     M        SD      Reliability¹

   ²551      CP806CX1     "R"³      25.5     6.6     0.66
   ⁴500      CP806CX1     "R"       22.1     6.8
   ²551      CP806CX1     "W"⁵      21.1     5.9      .59
   ⁴500      CP806CX1     "W"       19.8     5.8
   ²551      CP806CX1     "I"⁶      18.8     5.9      .50
   ²551      CP806CX2     "R"       34.9     6.1      .58
   ⁴500      CP806CX2     "R"       33.8     6.4
   ²551      CP806CX2     "W"       14.2     4.5      .46
   ⁴500      CP806CX2     "W"       14.1     5.2
   ²551      CP806CX2     "I"       15.8     5.8      .66

¹ Using Kuder-Richardson formula No. 21.
² Tested February through April 1945, at Psychological Research Unit No. 2.
³ "R" means responses indicating resistance to illusions.
⁴ Tested April and May 1945 at Medical and Psychological Examining Unit No. 8.
⁵ "W" means responses indicating susceptibility to illusions.
⁶ "I" means intermediate judgments, as defined in the text.

(2) Reliability coefficients. — Although the assumptions underlying the
formula are not completely satisfied, Kuder-Richardson estimates
(formula No. 21) were secured as preliminary evidence concerning test
reliability. The data are presented in the last column of table 17.19.
Correlations between the two forms also yield some indication of the
reliabilities of the test, although the two forms are perhaps not
comparable. The data are shown in table 17.20.


Table 17.20. — Correlations between Forms CP806CX1 and CP806CX2 of
Objectivity of Perception, based upon a sample of 551
unclassified aviation students¹

  Variable                    1        2        3        4        5        6

  1. "R" for CP806CX1        ...     -0.37    -0.30     0.42    -0.18    -0.30
  2. "W" for CP806CX1       -0.37     ...     -.06     -.21      .44     -.02
  3. "I" for CP806CX1        -.30    -.06      ...     -.31     -.21      .52
  4. "R" for CP806CX2         .42    -.28     -.31      ...     -.40     -.72
  5. "W" for CP806CX2        -.18     .44     -.21     -.40      ...     -.27
  6. "I" for CP806CX2        -.30    -.02      .52     -.72     -.27      ...

¹ Tested February through April 1945 at Psychological Research Unit No. 2.
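Kuder-Richardson formula No. 21, cited for the reliability column of table 17.19, needs only the number of items, the mean, and the standard deviation of a score. The Python sketch below states the standard formula. The number of items counted toward each "R", "W", or "I" score is not given in the text, so the inputs shown are hypothetical and are not meant to reproduce the tabled coefficients.

```python
def kr21(n_items: int, mean: float, sd: float) -> float:
    """Kuder-Richardson formula No. 21.

    r = (k / (k - 1)) * (1 - M * (k - M) / (k * SD**2))
    """
    if n_items < 2 or sd <= 0:
        raise ValueError("need at least two items and a positive SD")
    variance = sd ** 2
    return (n_items / (n_items - 1)) * (
        1.0 - mean * (n_items - mean) / (n_items * variance)
    )


# Hypothetical score: 50 items, mean 30, standard deviation 8.
print(round(kr21(50, 30.0, 8.0), 2))  # 0.83
```

The formula treats all items as equally difficult, which is one of the assumptions the text notes as not completely satisfied here.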


Evaluation. — The test appears to be quite difficult and the three scores
on each test appear to have only moderate reliability. The moderate
correlations (0.42 to 0.52) between corresponding scores on the two forms
indicate either relative lack of common-factor variance, and thus a
tendency towards specificity of susceptibility to different illusions, or
low reliability.
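One way to weigh the two explanations offered in this evaluation is the classical correction for attenuation, which estimates what the correlation between forms would be if both scores were perfectly reliable. This step is not taken in the source. The Python sketch below applies it to the between-form correlations of table 17.20 and the Kuder-Richardson coefficients of table 17.19 for the sample of 551, on the assumption that those coefficients are acceptable reliability estimates.

```python
import math


def disattenuated(r_xy: float, r_xx: float, r_yy: float) -> float:
    """Correlation corrected for unreliability in both measures."""
    return r_xy / math.sqrt(r_xx * r_yy)


# score: (correlation between forms, reliability on CX1, reliability on CX2)
scores = {
    "R": (0.42, 0.66, 0.58),
    "W": (0.44, 0.59, 0.46),
    "I": (0.52, 0.50, 0.66),
}

corrected = {name: disattenuated(*values) for name, values in scores.items()}
for name, value in corrected.items():
    print(f"{name}: {value:.2f}")
```

The corrected values come to roughly 0.68, 0.84, and 0.91. Read with caution, since formula No. 21 tends to underestimate reliability, they suggest that low reliability could account for a good part of the modest agreement between forms.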


Normality of Perception, CP806CX3 and CP806CX4 ¹⁴

These two tests utilize the same test booklets as Objectivity of
Perception, CP806CX1 and CP806CX2, respectively. The sole difference lies
in the instructions to the test. In taking an illusions test, the examinee
may adopt either a naive, phenomenological approach, or an approach
characterized by the intent to see objects as they are physically. It was
decided to investigate these two approaches, to discover what differences
in test performance might be attributable thereto, and which mental set,
if either, produced the more reliable and valid results.

(1) Administration. — The instructions to these tests attempt to
inculcate a naive, nonresistant set. Pertinent extracts from the directions
follow:

In this test you will be asked to report things exactly as you see them * * *
Many figures have been drawn deliberately to look different than they actually are.
That is, a line in one of these drawings may look longer or shorter than a ruler would
prove it to be * * * Your task is to report how the figure looks to you, not how
you think it should look * * * You will get the best score by working rapidly
and recording your first impression of how the figures actually look to you.

Statistical results. (1) Distribution statistics. — Distribution data are
shown in table 17.21.


Table 17.21. — Distribution data and Kuder-Richardson reliabilities of
Normality of Perception, CP806CX3 and CP806CX4, based upon a sample of 518
unclassified aviation students¹

  Form         Score²     M        SD      Reliability³

  CP806CX3     "R"        21.3     5.9     0.60
  CP806CX3     "W"        25.1     6.8      .67
  CP806CX3     "I"        16.6     5.2      .55
  CP806CX4     "R"        31.3     5.2
  CP806CX4     "W"        18.2     5.1      .50
  CP806CX4     "I"        14.6     5.4      .62

¹ Tested in February and March 1945 at Psychological Research Unit No. 2.
² For meaning of scores, see footnotes to table 17.19.
³ Using Kuder-Richardson formula No. 21.


(2) Reliability coefficients. — Kuder-Richardson reliabilities were
computed for these forms also, and are shown in the last column of table
17.21. Again, correlations between the forms yield some indication of
test reliability and they are presented in table 17.22.

¹⁴ Developed at Psychological Research Unit No. 2. Chief contributor:
Capt. John I. Lacey.

Table 17.22. — Correlations between Forms CP806CX3 and CP806CX4 of
Normality of Perception based upon a sample of 518 unclassified aviation
students¹

  Variable                    1        2        3        4        5        6

  1. "R" for CP806CX3        ...     -0.32    -0.12     0.32    -0.19    -0.13
  2. "W" for CP806CX3       -0.32     ...     -.27     -.24      .51     -.19
  3. "I" for CP806CX3        -.12    -.27      ...     -.15     -.31      .46
  4. "R" for CP806CX4         .32    -.24     -.15      ...     -.33     -.49
  5. "W" for CP806CX4        -.19     .51     -.31     -.33      ...     -.51
  6. "I" for CP806CX4        -.13    -.19      .46     -.49     -.51      ...

¹ Tested in February and March 1945 at Psychological Research Unit No. 2.


Evaluation. — The illusions tests administered with instructions
intended to produce a naive, phenomenological set do not seem to differ
much from the previous forms. Reliabilities and score intercorrelations
seem fairly comparable. Administering the tests with normality
instructions rather than objectivity instructions results in fewer
judgments indicating resistance to illusions and more judgments indicating
susceptibility to illusions (compare tables 17.19 and 17.21), but the
differences are quite small.

Objectivity of Perception, CP806BX1, and Normality of Perception,
CP806BX3 ¹⁵

These tests are comparable in arrangement, format, and directions to
the previous forms. They differ, however, in content, being devoted to
illusory effects stressed by Gestalt psychologists, centering about the
concepts of structure and articulation of parts. The two sets of
instructions are entirely comparable to those used in the previous tests.

The test booklet is divided into 10 parts, comprising 125 items. Sample
items of three different sections are shown in figure 17.11. For
illustrative purposes, the correct answers to the items chosen for
reproduction are all "equal." The nature of the test may be made clear in
the following listing which gives the phenomenon of perception with which
several of the parts are concerned.

Part I: Parts of well-structured objects appear less numerous than
irregularly spaced parts of non-structured groups (see upper panel of
fig. 17.11).

Part  II :  Parts  of  homogeneous  regular  objects  appear  less  numerous 
than  parts  of  broken,  irregular  objects  (see  middle  panel  of  fig.  17.11). 

Part  III:  Articulated  groups  composed  of  identical  objects  (plane 
silhouettes)  appear  less  numerous  than  unorganized  groups  composed  of 
dissimilar  objects  (see  lower  panel  of  fig.  17.11). 

Part  IV :  Triangles  appear  larger  in  area  than  Greek  crosses. 

Part  VII:  Of  two  equilateral  triangles,  one  placed  above  the  other,  the 
lower  one  is  perceived  as  of  lesser  area. 

Part  IX :  A  tilted  square  is  perceived  as  of  greater  area  than  the  same 
square  presented  in  the  horizontal-vertical  orientation. 

Other parts constitute variants and combinations of these.

¹⁵ Developed at Psychological Research Unit No. 3. Chief contributors:
Sgt. Stanley Blumberg, Capt. John I. Lacey, and Lt. Eli A. Lipman.

FIGURE 17.11

SAMPLE ITEMS OF OBJECTIVITY OF PERCEPTION, CP806BX1, AND NORMALITY OF
PERCEPTION, CP806BX3

Each  section  of  the  test  is  separately  timed  with  1  minute  and  10  sec¬ 
onds  allotted  to  part  V,  1  minute  and  25  seconds  to  part  IV,  1  minute 
and  30  seconds  to  part  III,  1  minute  and  35  seconds  to  parts  II  and  IX, 
and  1  minute  and  40  seconds  to  parts  I,  VI,  VII,  VIII  and  X. 

These time limits are based upon preliminary experimental
administration of the test to 248 unclassified aviation students at
Psychological Research Unit No. 2.

No  other  statistical  data  were  available  at  the  time  this  was  written. 

EVALUATION  OF  FORM  PERCEPTION  TESTS 

Too few data are available to permit a thorough evaluation of
form-perception tests. The lack of extensive factorial data, in particular,
precludes the type of interpretation and evaluation that has proved to be
so useful and fruitful.

The two pattern-formation tests for which data are available (Picture
Integration, CP104A, and Pattern Assembly, CP804A) have moderate
pilot validity, attributable in large part to saturations in the
space-relations, perceptual-speed, visualization, and length-estimation
factors. It is noteworthy that for both these tests there are sizeable
discrepancies between communalities and reliabilities. It may be that at
least some of this disparity is due to a new factor of perceptual
integration. While there are no data to support this speculation, the
problem is worthy of study.

The  two  pattern-completion  tests  (Mutilated  Words,  CP512A,  and 
Object  Completion,  CP811A)  apparently  have  little  or  no  pilot  validity, 
which  argues  for  a  lack  of  the  factors  known  to  be  valid  for  pilots.  The 
additional  fact  that  very  similar  tests  were  found  by  Thurstone  (2)  to 
have  prominent  loadings  in  a  factor  tentatively  identified  as  "speed  and 
strength  of  closure"  implies  that  this  factor  lacks  pilot  validity. 

The  pattern-analysis  tests  promise  only  low  to  moderate  pilot  validity. 
The  value  of  the  illusions  tests  is  yet  unexplored.  Available  data  indicate 
relative  specificity  of  the  various  illusions,  although  Thurstone  (2)  has 
tentatively established the existence of a factor common to different
geometric illusions.

Size and Distance Estimation Tests ¹


INTRODUCTION 


Contents  of  the  Chapter 

This  chapter  considers  tests  that  measure  the  ability  to  perceive  size 
and  distance  accurately.  For  convenience,  not  because  of  a  considered 
psychological  analysis,  the  tests  are  divided  into  three  categories,  to  each 
of  which  a  section  is  devoted.  The  categories  are:  (1)  Distance  Judg¬ 
ment  Tests,  (2)  Angular  Judgment  Tests,  and  (3)  Judgment  of  Propor¬ 
tions  Test.  Following  the  discussion  of  the  tests,  a  general  evaluation  of 
the  area  is  presented. 

Rationale for Development of Size and Distance Estimation Tests

Piloting  a  plane,  navigating  a  plane,  sighting  on  a  target — all  these  ac¬ 
tivities  seem  to  involve  accurate  perception  of  size  and  distance.  Early 
job-analysis  data  provided  some  indication  of  the  importance  of  this 
perceptual  activity  in  each  of  the  three  main  air  crew  positions. 

In table 1.5, it may be seen that estimation of speed and distance was
mentioned in from 27 percent to 3[?] percent of eliminations from
elementary and advanced training, and in 1 percent (in one sample) and
5 percent (in another sample) of operational reclassifications.

When combat supervisors rated 20 traits for their importance in combat,
on a 9-point scale in which the nominal figure 5 was taken to mean
"better than average," estimation of speed and distance was rated 5.8
for combat bombardiers (see table 1.2), 6.6 for combat navigators (see
table 1.4), 7.5 for combat fighter pilots, and 6.1 for combat bomber
pilots (see table 1.6).

Late in the program (February 1944), an intensive job analysis of
the act of landing a plane was undertaken by Psychological Research
Unit No. 3, which supplied some very specific information on the role of
distance perception. Four aviation psychologists spent 1 1/2 weeks at a
primary school collecting data on individually observed landings,
interviewing students and instructors, and themselves experiencing
landings. The main impetus for this research was provided by the fact that
a large number of eliminations from elementary training occurred during
the period of concentrated practice on landings.

¹ Written by Capt. John I. Lacey and Tech. Sgt. [name illegible].

While the job-analysis data cannot be presented in detail here, some
description of the role of distance estimation may be given. In turning
into his approach to the landing lane, the pilot must accurately judge
distance so that he may cut the throttle and make the gliding turn at the
correct time. As the airplane nears the ground, the pilot causes the plane
to lose air speed until it "stalls out" and drops to the ground. A perfect
landing can occur only when the pilot accurately judges his height off the
ground. On the ground, of course, the pilot must estimate the distance
of other objects from his own plane.

Data such as these more than justify the development of tests of
distance estimation, when the distances involved are relatively great.
Implicit in the development of printed tests of length estimation,
however, is the assumption of a factor of length estimation common to a
wide range of distances. A similar assumption is made in the development
of printed tests of judgment of angular magnitudes.

DISTANCE  JUDGMENT  TESTS 

In  this  section  will  be  described  those  tests  that  measure  the  ability  to 
estimate  linear  and  nonlinear  extents. 

Shorter  Line,  CP606 

Early in the classification program, the best available information
concerning the duties of pilots and navigators indicated that tests
sampling the abilities to estimate size and distance, to make quick and
accurate approximate readings from tables, graphs, and meters, to
apprehend quickly number-size, etc., had promise of validity. These
diverse activities were grouped together under the term "Quantitative
Perception," and the Office of the Air Surgeon, Psychological Branch,
requested the Cooperative Test Service to supply a series of such tests.
In this chapter, three of these tests, Shorter Line, CP606, Nearest
Point, CP607, and Shortest Path, CP608, are discussed.²

Shorter  Line,  CP606,  is  part  VI  of  the  Quantitative  Perception  Test. 
This  test  was  used  in  the  classification  battery  of  April  1942. 

Description. — In this test, the examinee is required to indicate which
of two lines is the shorter.

(1) Internal characteristics. — The test comprises 30 items which are
printed on one-half of one side of an IBM answer sheet. Each item
consists of five straight lines of different lengths radiating from a
central point. Distracting lines, curves, and figures are added. Two of
the lines are labeled with the letter-symbols A and B. The task of the
examinee is to select the shorter line of the two that are labeled, and to
record his answer in the space to the right of the diagram. Two examples
of items are shown in figure 18.1.

² For a discussion of the other parts of the Quantitative Perception
Test, see ch. 16.

FIGURE 18.1

SAMPLE ITEMS OF SHORTER LINE, CP606

(2)  Administration. — Administrative  directions,  greatly  simplified  by 
having  the  test  items  on  the  answer  sheet,  are  quite  short.  They  state : 

In each of the figures below, there are five straight lines radiating from a central
point. One line is labeled a and another b. Decide which of these two lines is shorter,
and blacken the appropriate answer-space below at the right of the figure. Pay no
attention to any lines through or near the five "spokes" radiating from the central
point ... You will have 2 minutes for this part ...

(3) Scoring. — The scoring formula used in the classification battery
was R - 3W.

Statistical results. (1) Distribution statistics. — Typical examples of
distribution statistics are given in table 18.1. The distribution curves
are approximately symmetrical.


Table 18.1. — Distribution constants for Shorter Line, CP606

  Group                                 N       M        SD

  Unclassified aviation students¹       240     22.1     [?].9
  Unclassified aviation students¹       527     21.6     6.1
  Navigators²                           392     21.9     6.1

¹ Testing dates and units unidentified.
² In Hondo classes 42-11 to 42-16. Tested at Psychological Research Unit No. 2.


(2) Factorial composition. — The most significant loading (0.44) is
in the length-estimation factor, in which the test is almost pure. This
loading is the weighted average of the loadings in two analyses. The
communality was found to be very low (0.27). For a fuller picture of
the factorial composition of this test, see appendix B.

(3)  Test  validity. — Validation  results  based  on  several  samples  are 
given  in  table  18.2. 

Evaluation. — Shorter Line shows only moderate pilot and navigator
validity. There are no data concerning the reliability of the test, but the
evidence of the low coefficients for Nearest Point, CP607, and Shortest
Path, CP608, suggests that it too is quite low.

Factor  analyses  of  this  test  show  that  27  percent  of  the  total  variance 
has  been  accounted  for  by  common  factors.  Of  this,  only  the  length-esti¬ 
mation  factor  accounts  for  a  significant  amount  (19  percent)  of  the 
total  variance  of  the  test.  The  remaining  8  percent  is  accounted  for  by 
factors on which the loadings are quite low. While Pattern Assembly,
CP804A  (see  ch.  17),  has  a  higher  loading  on  the  length-estimation  fac¬ 
tor,  Shorter  Line  is  a  purer  measure  of  this  factor.  The  loading  is  suffi¬ 
ciently  high  to  suggest  that,  with  an  increased  reliability,  the  test  may 
find  value  in  future  factor  research  as  well  as  in  selection  where  this 
factor  is  valid.  It  is  likely  that  the  distracting  lines  merely  serve  to  create 
illusions  and  to  render  the  test  more  ambiguous  factorially.  In  a  later 
test  of  the  same  kind,  distractors  were  omitted. 
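The percentages quoted above follow from the loadings if the factors are taken to be uncorrelated, in which case each factor's share of total test variance is the square of its loading and the communality is the sum of those squares. The sketch below is only an illustration under that assumption; the function name `variance_breakdown` is hypothetical, and the two figures used (loading 0.44, communality 0.27) are the ones reported in the text.

```python
def variance_breakdown(loadings):
    """Return (share of total variance per factor, sum of shares).

    Assumes orthogonal (uncorrelated) factors, so that each share
    is simply the squared loading.
    """
    shares = {name: a * a for name, a in loadings.items()}
    return shares, sum(shares.values())


# Loading of Shorter Line on the length-estimation factor, from the text.
shares, partial_communality = variance_breakdown({"length_estimation": 0.44})

# Share of total variance due to length estimation: about 19 percent.
length_share = shares["length_estimation"]

# Fraction of the reported communality (0.27) carried by this one factor,
# one rough way of expressing how "pure" the test is.
purity = length_share / 0.27

print(round(length_share, 2), round(purity, 2))
```

On these figures roughly seven-tenths of the common-factor variance lies in the one factor, which is the sense in which the test is described as nearly pure.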

An estimate of the pilot validity of this test, made from factorial
information (see ch. 28), is 0.11, which is somewhat short of the empirical
validity. This discrepancy may be large enough to indicate some validity
in other factors that were not included in this estimate. Other possible
explanations of the discrepancy would be underestimation of the
length-estimation factor loading in the test CP606, or in the pilot criterion, or
in both.
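One common way to estimate a validity coefficient from factorial information, assuming uncorrelated factors, is to sum the products of the test's loadings and the criterion's loadings over the factors they share. Whether this is exactly the procedure of the chapter cited is not confirmed here. The sketch below uses the reported test loading of 0.44; the criterion loadings are hypothetical values chosen only to make the example run and are not taken from the report.

```python
def estimated_validity(test_loadings, criterion_loadings):
    """Sum of products of test and criterion loadings on common factors.

    Assumes orthogonal factors. Factors missing from either
    dictionary contribute nothing to the estimate.
    """
    shared = set(test_loadings) & set(criterion_loadings)
    return sum(test_loadings[f] * criterion_loadings[f] for f in shared)


# Test loading from the text; criterion loadings are HYPOTHETICAL.
test = {"length_estimation": 0.44}
criterion = {"length_estimation": 0.30, "visualization": 0.20}

validity = estimated_validity(test, criterion)
print(round(validity, 3))
```

The sketch also shows why an underestimated loading, on either the test or the criterion, lowers the estimate: the product shrinks in direct proportion.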

Nearest  Point,  CP607 

This  is  part  VII  of  the  Quantitative  Perception  Test.  This  test  was  also 
in  the  classification  battery  of  April  1942. 

Description. — Like Shorter Line, this test is printed on one-half of one
side of an IBM answer sheet and is so designed that there are but two
alternative answers.

(1) Internal characteristics. — Each item consists of five dots irregularly
scattered around a reference point, which is in the form of a dot
with a small circle around it. Two of the dots are labeled a and b. The
task of the examinee is to select, from the labeled dots, the one that is
nearer the reference point. In many of the items, lines, curves, and figures
are drawn in and around the pattern formed by the dots. There are
30 items in the test. Two illustrative items are shown in figure 18.2.


FIGURE  18.2 

SAMPLE  ITEMS  OF  NEAREST  POINT  CP607 


(2) Administration. — One completed sample item is given in the directions
to explain the test. Two minutes are allowed for testing time.
The directions instruct the examinee to work carefully and state that each
error will result in a deduction of three points from the score.

(3) Scoring. — The scoring formula is R - 3W.

Statistical results. (1) Distribution statistics. — Typical examples of
distribution statistics obtained on this test are given in table 18.3. The
distribution curves are approximately symmetrical.


Table 18.3. — Distribution constants for Nearest Point, CP607

N        M
527     15.9
24)     17.0
392     17.6

¹ Testing dates and units unidentified.
² In Hondo classes 42-11 to 42-16. Tested at Psychological Research Unit No. 2.


(2) Reliability coefficient. — By the odd-even method an estimated
reliability coefficient of 0.87, corrected for length, was obtained. This figure
is based on a sample of 243 unclassified aviation students. Since the test
is speeded, this figure is an overestimation. Because the scoring formula
(R - 3W), however, emphasizes errors, the overestimation may not be
serious.
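"Corrected for length" in an odd-even estimate ordinarily refers to the Spearman-Brown step-up from the half-test correlation to the full-length test. The text does not name the formula, so the sketch below assumes that standard correction; the function names and the example value of 0.50 are illustrative only.

```python
def spearman_brown(r_part, factor=2.0):
    """Reliability of a test lengthened `factor` times.

    r_part is the correlation between comparable parts (for the
    odd-even method, the two half-tests, so factor = 2).
    """
    return factor * r_part / (1.0 + (factor - 1.0) * r_part)


def half_test_correlation(r_full):
    """Invert the doubling formula: half-test r implied by a full-test r."""
    return r_full / (2.0 - r_full)


# Illustrative value only: a half-test correlation of 0.50
# steps up to a full-length reliability of about 0.67.
stepped_up = spearman_brown(0.50)
print(round(stepped_up, 3))
```

The same function with a larger `factor` indicates what lengthening a short test might be expected to do to its reliability, provided the added items behave like the existing ones.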

(3) Factorial composition. — The most significant weighted-average
loadings are in the length-estimation (0.43), general-reasoning (0.21),
and visual-memory (0.23) factors. The loading in general reasoning is
suspect. The communality is 0.38. For a fuller picture of the factorial
composition of this test, see appendix B.

(4) Test validity. — Validation results based on several samples are
given in table 18.4.

Evaluation. — This extremely simple test has good pilot validity (average
coefficient approximately 0.19 based upon 4,045 cases). Since the
test is only 2 minutes long, however, its reliability is low. Lengthening
would probably improve both reliability and validity at low cost in time.
The estimated validity, utilizing factorial information (see ch. 28), is
0.14, which is a little short of the empirical validity.

In the factor analysis of this test, 38 percent of the total variance was
accounted for by common factors. The length-estimation factor accounts
for 18 percent, the general-reasoning factor for 4 percent, and the
visual-memory factor for 5 percent of the total variance. The remaining 11
percent is accounted for by factors in which the loadings are quite low.

Shortest Path, CP608

This is part VIII of the Quantitative Perception Test, also included
in the classification battery of April 1942.

Description. —  The  test  consists  of  30  items  printed  on  one-half  of  one 
side  of  an  IBM  answer  sheet. 

(1) Internal characteristics. — Each item consists of two points, labeled
P and Q, placed about 1 1/2 inches apart. Three curved or angular
lines, labeled A, B, and C, are drawn between these two points. The task
of the examinee is to select the path, or line, between the two points that
is shortest. The examinee records his answer directly below the figure.
As illustrations, figure 18.3 presents two items of the test.


FIGURE  18.3 

SAMPLE  ITEMS  OF  SHORTEST  PATH,  CP608 


(2)  Administration. —  One  completed  problem  is  shown  next  to  the  di¬ 
rections.  The  testing  time  is  2  minutes.  The  examinee  is  warned  to  work 
carefully,  since  a  heavy  penalty  (three  points)  is  given  to  errors. 

(3) Scoring. — The scoring formula is R - 3W.

Statistical results. (1) Distribution statistics. — Typical examples of
distribution statistics obtained on this test are given in table 18.5. The
distribution curves are approximately symmetrical.

Table  18.5. —  Distribution  constants  for  Shortest  Path,  CP608 


¹ Testing dates and units unidentified.

² In Hondo classes 42-11 to 42-16. Tested at Psychological Research Unit No. 2.


(2) Reliability coefficient. — By the odd-even method, an estimated
reliability coefficient of 0.09, corrected for length, was obtained, based on
a sample of 233 unclassified aviation students. This figure is an
overestimation, since the test is highly speeded. The scoring formula weights
errors heavily, however, so the overestimation may not be serious.

(3) Factorial composition. — The most significant loadings are in the
length-estimation (0.46), spatial-relations (0.32), visualization (0.28),
and perceptual-speed (0.25) factors. The communality is 0.52. For a
fuller picture of the factorial composition of this test, see appendix B.

(4)  Test  validity. — Validation  results  based  on  several  samples  are 
given  in  table  18.6. 

Evaluation. — This test has good validity for pilots (average validity
coefficient approximately 0.25) and navigators (average validity coefficient
approximately 0.25). Its validity for bombardiers is difficult to evaluate
because of the low reliability of circular error as a measure of proficiency
for that air crew position.
Factor-analysis data for this test indicate that 52 percent of the total
variance has been accounted for by common factors. Of this, the
perceptual-speed factor accounts for 5 percent, the visualization factor 8
percent, the spatial-relations factor 10 percent, and the length-estimation
factor 21 percent of the total variance. The remaining 8 percent is
accounted for by factors in which the test has very low loadings. It is not
as pure a measure as Shorter Line, CP606.

The estimated pilot validity coefficient (computed from factor
equations, see ch. 28) is the same as that found empirically. This indicates
that all the factors valid for pilot selection are accounted for in this test.

Map  Distance,  CP626B  * 

The known validity of Nearest Point, CP607, provided the basis for
the construction of Map Distance. It was desired to construct a longer
test to increase reliability and to add more face validity to a test of
distance estimation. The test was designed also to utilize somewhat longer
distances than did the Nearest Point test.

Description. — The  test  utilizes  four  copies  of  a  given  portion  of  an 
airways  map  (6J/i  inches  x  7'/  inches).  A  copy  of  the  map  appears  on 
each  of  four  pages,  arranged  in  two  double-spreads.  Twelve  towns  on 
each  map  are  indicated  by  a  large  dot  and  a  letter-symbol.  The  12  towns 
indicated  vary  from  map  to  map.  A  reference  point,  identified  by  a  dot 
with  a  circle  around  it,  is  also  placed  on  the  map.  The  position  of  this 
reference  point  also  varies  from  map  to  map.  The  task  of  the  examinee 
is to indicate which of two towns is closer to the reference point. One
page of the test is shown in figure 18.4.

(1)  Internal  characteristics. — As  described  above,  there  are  48  items, 
divided  into  2  parts,  in  this  test.  A  sample  map  with  three  items  is  used 
at  the  beginning  of  the  test. 

(2)  Administration. — Pertinent  parts  of  the  directions  for  the  test 
are  as  follows: 

This is a test of your ability to estimate distances on maps. Suppose that you are
a pilot and your problem is to judge which of two towns is the closer. In order to
determine this, you will have to consult your map and estimate the distance from
your plane to each of the towns in question. In this test, you will be shown a map
and asked to compare the distances from a given reference point to various pairs of
towns. Your task each time will be to choose the nearer town.

Each  double-spread  is  separately  timed.  Three  minutes  are  allowed 
for  each  part  and  5  minutes  for  the  administration.  For  experimental 
purposes,  however,  5  or  9  minutes  were  occasionally  allowed  for  each 
part. 

(3)  Scoring. — The  scoring  formula  is  R  —  3W+40. 
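The scoring formula can be written out as a small function. The sketch below simply evaluates R - 3W + 40 for a paper with a given number of right and wrong answers; the function and argument names are hypothetical, and the remark about the constant is an inference, not something the text states.

```python
def formula_score(rights, wrongs, penalty=3.0, constant=40.0):
    """Score a paper as R - penalty * W + constant.

    Defaults reproduce the Map Distance formula R - 3W + 40.
    Omitted items count as neither right nor wrong.
    """
    return rights - penalty * wrongs + constant


# A paper with 30 right and 4 wrong out of 48 items.
careful = formula_score(30, 4)

# Blind guessing on two-choice items yields about as many wrongs as
# rights, so the heavy penalty pushes such a paper well below the
# constant: 20 right and 20 wrong gives 20 - 60 + 40.
guessing = formula_score(20, 20)

print(careful, guessing)
```

A plausible reading is that the added constant of 40 serves mainly to keep most reported scores positive; it does not change the rank order of examinees.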

* Developed at Headquarters, AAF Training Command. Chief contributors: Maj. S. Rains
Wallace and Staff of the Perceptual Research Unit.

Statistical results. (1) Distribution statistics. — Typical examples of
distribution statistics are given in table 18.7. The distribution curves are
approximately symmetrical and considerably flatter than normal.


Indicate which one of the points in each pair listed below is nearer to the
reference point in the lower right-hand corner of the map.

[Twelve numbered items follow on the page, each naming a pair of lettered towns, for example "3 - E or J".]

FIGURE  18.4 

SAMPLE  MAP  &  ITEMS  OF  MAP  DISTANCE, 
CP626B 


Table 18.7. — Distribution constants for Map Distance, CP626B, based on a sample
of 315 classified pilots¹

                                         Range of scores
Scoring formula       M        SD        Low       High
Rights only          16.1      4.7         2        11
R - 3W + 40          22.5      ii a      -20        11

¹ In classes IlK and Art. Tested at Psychological Research Unit No. 1.

(2)  Internal  consistency . — Analysis  of  responses  of  several  sample 
groups  yielded  the  internal-consistency  data  given  in  table  18.8. 
Table 18.8. — Internal-consistency data for 48 items of Map Distance, CP626B,
based on samples of unclassified aviation students

Scoring formula       N        M(phi)    SD(phi)    Low
R - 3W + 40          750¹      0.28      0.21      -0.07
Rights only          447²       .18       .10        .01
R - 3W               417²       .32       .10       -.06

¹ Tested at Psychological Research Unit No. 3; dates unidentified.
² Tested at Psychological Research Unit No. 3.


(3) Reliability coefficient. — One sample yielded the estimates of
reliability given in table 18.9.

Table 18.9. — Estimated reliability coefficients (odd-even) for Map Distance,
CP626B,¹ based upon samples of unclassified aviation students³

N        Score            r(tt)²
ii.i     R - 3W + 40      0 7:
ISi      R - 3W + 40      * \

¹ With a testing time of 10 minutes.
² Note that this is a speeded test, but with heavy penalty for errors. Odd-even
reliabilities, therefore, are spuriously high.
³ Testing dates and units unidentified.

(4) Correlation between rights and wrongs. — For a sample of 911
navigators (see table 18.10), the correlation between rights and wrongs
was only -0.03. For a sample of 239 pilots tested at Psychological Research
Unit No. 3 in September, October, and November 1944, the
correlation was -0.07.

(5) Difficulty. — Based upon item analysis of the responses of 750
unclassified aviation students, the test yielded a mean proportion of correct
responses of Or?, corrected for chance, with a range from 03)0 to OS1),
and a standard deviation of 0.27.

(6) Factorial composition. — The most significant weighted-average
loadings are in the visualization (0.37) and length-estimation (0.30)
factors. The communality is 0.35, which is far below the test's probable
reliability. For a fuller picture of the factorial composition of this test,
see appendix B.

(7) Test validity. — Validation results based on several samples are
given in table 18.10.

(8) Item validity. — Validation of items for a sample of 3/(> pilots in
classes *14 1* and 4 III, tested at Psychological Research Unit No. 3,
yielded a mean phi of 0.05, with a range from -0.(3 to +0 2>) and a
standard deviation of 0(0. In this sample, SR percent was graduates.

A study in scoring. — From the formula for the correlation of sums it
can be shown that marked changes in the reliability of a test may
occur with the assignment of varying relative weights to right and wrong
answers. Similar changes can be brought about in validity coefficients.
These modifications in reliability and validity coefficients are most
marked when right and wrong answers measure different functions, i. e.,
when the correlation between rights and wrongs is low.
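The dependence of a validity coefficient on the weight given to wrong answers can be made concrete with the usual expression for the correlation of a criterion with a weighted difference of two scores. The sketch below assumes the score has the form R - kW and uses entirely hypothetical correlations and standard deviations; none of the numbers are taken from the report.

```python
import math


def composite_validity(r_cr, r_cw, r_rw, sd_r, sd_w, k):
    """Correlation between a criterion c and the score R - k * W.

    r_cr, r_cw : correlations of the criterion with rights and wrongs
    r_rw       : correlation between rights and wrongs
    sd_r, sd_w : standard deviations of rights and wrongs
    k          : penalty weight applied to wrongs
    """
    numerator = sd_r * r_cr - k * sd_w * r_cw
    variance = sd_r ** 2 + (k * sd_w) ** 2 - 2.0 * k * sd_r * sd_w * r_rw
    return numerator / math.sqrt(variance)


# HYPOTHETICAL inputs: rights and wrongs uncorrelated, rights slightly
# valid, wrongs slightly (negatively) valid.
params = dict(r_cr=0.20, r_cw=-0.10, r_rw=0.0, sd_r=5.0, sd_w=2.0)

rights_only = composite_validity(k=0.0, **params)
penalty_three = composite_validity(k=3.0, **params)

print(round(rights_only, 3), round(penalty_three, 3))
```

With a correlation near zero between rights and wrongs, the two parts of the score act as separate measures, so changing k changes what the composite measures as well as how well it predicts.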

Data  available  for  the  Map  Distance  test  provide  a  dramatic  example 
of  changing  factorial  composition  of  a  test  score  with  changing  emphasis 
on  right  and  wrong  answers. 

The  problem  was  set  by  the  results  of  an  internal-consistency  item 
analysis  against  the  criterion  of  total  score,  computed  by  the  formula 
R— 3W+40.  Utilizing  the  highest  27  percent  and  the  lowest  27  percent 
of  a  group  of  417  unclassified  aviation  students,  the  mean  phi  based  on 
total  group  was  0.10,  whereas  the  mean  phi  based  on  total  answered 
was  0.32.  These  results  were  completely  the  reverse  of  those  to  be  antici¬ 
pated  for  a  speeded  test.  In  a  true  speed  test  where  everyone  attempting 
an  item  responds  correctly  and  a  good  score  depends  merely  on  the  speed 
of  response,  the  phis  based  on  total  answered  would  all  be  zero,  whereas 
the  phis  based  on  total  group  would  regularly  increase  from  some  one 
item in the test to the last item. As the speed element in a test decreases
in importance, the discrepancy between phis computed on the two bases
continuously decreases, until, with a pure power test, it disappears.
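The two bases for phi can be illustrated with a fourfold table of upper-group and lower-group examinees against pass and fail on one item. In the sketch below, "total group" counts everyone who did not pass (including those who never reached the item) as failing, while "total answered" counts only those who attempted it. The function names and the counts are hypothetical, and the treatment of an empty margin as phi = 0 is a convention adopted here.

```python
import math


def phi(a, b, c, d):
    """Phi for a 2x2 table: a, b = upper pass, fail; c, d = lower pass, fail."""
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else 0.0


def item_phis(upper_pass, upper_answered, lower_pass, lower_answered, n_each):
    """Return (phi on total group, phi on total answered) for one item."""
    on_group = phi(upper_pass, n_each - upper_pass,
                   lower_pass, n_each - lower_pass)
    on_answered = phi(upper_pass, upper_answered - upper_pass,
                      lower_pass, lower_answered - lower_pass)
    return on_group, on_answered


# HYPOTHETICAL late item in a pure speed test: everyone who reaches it
# answers correctly, but only 40 of the 100 low scorers reach it.
on_group, on_answered = item_phis(100, 100, 40, 40, n_each=100)

print(round(on_group, 3), on_answered)
```

For this pure-speed item the phi on total answered is zero and the phi on total group is substantial, which is the pattern the paragraph above describes as normal for a speeded test.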

The  results  of  the  item  analysis  of  the  Map  Distance  test  indicate, 
then,  that  this  test,  scored  to  weight  errors  heavily,  would  be  more  in¬ 
ternally  consistent  if  it  were  administered  as  a  power  test.  Further  in¬ 
vestigation  of  the  problem  was  indicated.  Utilizing  the  same  sample  of 
447  individuals,  a  new  item  analysis  was  performed  against  the  criterion 
of  total  rights  only.  The  results  were  in  accord  with  expectation;  the 
mean  phi  based  on  the  total  group  was  now  0.31,  and  that  based  on  total 
answered  was  0.18.  Apparently,  then,  the  rights  score  is  a  speed  score, 
whereas  the  error  score  is  a  power  score. 

These results led to a small but revealing intercorrelational study.
Using a sample of 315 unclassified aviation students and scoring the
papers with two formulas, R and R - 3W + 40, correlations with a
selected group of tests were computed. The tests were selected to reveal
possible changes in factorial composition. The data are presented in
table 18.11.


Table 18.11. — Correlations of two scores on Map Distance, CP626B, with selected
tests

                                              Score correlated
Test                                          R        R - 3W + 40
Speed of Identification, CP610A              0.31        0.04
Spatial Orientation I, CP501B                 .37        -.02
Mechanical Principles, CI903A                 .07         .25
Reading Comprehension, CI614G                 .20         .23
Numerical Operations (I), CI702B              .33        -.02
Numerical Operations (II), CI702B             .27        -.01
Mathematics B, CI206C                         .25         .23
General Information (Nav), CE505D             .30         .15
General Information (Pilot), CE505D           .22         .10
Complex Coordination, CM701A                  .27         .20
The correlations for Speed of Identification and Spatial Orientation I
show conclusively that the rights score would have a significant loading
on the perceptual-speed factor, whereas the error-weighted score would
not.4 The large discrepancy between the correlations with Mechanical
Principles and the small difference yielded by Reading Comprehension
indicate, on the other hand, that the error-weighted score has a
visualization loading, while the rights score does not. Similarly, the rights score
involves some numerical variance. There is also some indication of higher
verbal and spatial-relations loadings for the rights than for the
error-weighted score.

While results showing such extreme differences may be rare, there is
no doubt that rights and wrongs in all but power tests deserve separate
consideration and study.

Evaluation. — Factor  analyses  of  this  test  indicate  that  it  is  not  a  par¬ 
ticularly  good  measure  of  any  one  function.  The  loadings  on  the  various 
factors  show  that  only  35  percent  of  the  total  variance  of  this  test  has 
been  accounted  for  by  common  factors.  Of  this,  the  visualization  factor 
accounts  for  14  percent  and  the  length-estimation  factor  9  percent.  The 
remaining  12  percent  is  accounted  for  by  other  factors  on  which  the 
loadings are quite low. Other tests are available that have higher loadings
on  the  visualization  and  length-estimation  factors. 

The estimated pilot validity, computed from known factor validities
(0.22), is approximately equal to the average empirical coefficient (0.20).
This indicates that all the factors in this test that are valid for pilot
training have been accounted for. Since only about one-half of the nonerror
variance has been accounted for, it is clear that additional factors may
yet be brought out by further analysis of this test. If more complete
analysis uncovers another factor, or factors, with significant loadings,
further refinements may be worth while, particularly in view of the
fairly high navigator validity.

Path  Length,  CP628  * 

Just as Map Distance, CP626, was constructed to provide a test
comparable to Nearest Point, CP607, but with greater face validity and
reliability, so Path Length, CP628, was constructed to provide a modified
form of Shortest Path, CP608. The test was not included in any
classification battery, but was used for experimental analysis only.

Description. — Portions of maps, presented in halftones, are shown,
which include a reference point indicated by a dot with a circle around
it and three other points marked A, B, and C. Each of the three points
is connected to the reference point by a heavy black line, which may
follow the wandering course of a river, a path, or a road. The task of the

4 This interpretation, and those to follow, will become clearer upon the reader's
familiarization with the contents of ch. 28, A Factorial Picture of Tests and Criteria.

* Developed at Headquarters, AAF Training Command. Chief contributors: Maj. S. Rains
Wallace and Staff of the Perceptual Research Unit.
examinee is to select the lettered point that is connected to the reference
point by the shortest path.

(1) Internal characteristics. — The test consists of 34 items, the first 2
of which are used as practice problems. The maps vary from item to
item in size and complexity. A sample problem is shown in figure 18.5.


FIGURE  18.5 

SAMPLE  MAP  OF  PATH  LENGTH,  CP628 

(2) Administration. — The examinees are told that this is a test of
their ability to estimate lengths of various paths on a map. It is strongly
emphasized that they are to compare lengths of the paths indicated and
not straight-line distances between the points. The total testing time is
approximately 8 minutes, with 5 minutes allowed for the selection and
marking of answers.

(3) Scoring. — The scoring formula is R - W/2.

Statistical results. (1) Distribution statistics. — Typical examples of
distribution statistics are given in table 18.12.


Table 18.12. — Distribution constants for Path Length, CP628B

Group              N        M       SD
Navigators¹       192     19.6     4.2
Pilots²           176     19.4     4.6
Do³               112     17.1     5.6

¹ In Hondo classes 42-11 to 42-16. Tested at Psychological Research Unit No. 2.
² In class 43K. Tested at Psychological Research Unit No. 3.
³ In classes 44B and 44C. Tested at Psychological Research Unit No. 3.


(2) Reliability coefficient. — The test was administered in two
separately timed halves. An estimated reliability coefficient of 0.25, corrected
for length, was obtained. This figure is based on a sample of 410
unclassified aviation students (testing dates and units unidentified).
(3) Factorial composition. — The largest loadings are in the
length-estimation (0.25), spatial-relations (0.23), general-reasoning (0.21), and
visualization (0.19) factors. Its communality (0.27) almost coincides with
its reliability. For a fuller picture of the factorial composition of this test,
see appendix B.

(4) Test validity. — A sample of 550 pilots, in classes 43K, 44A, 44B,
and 44C, tested at Psychological Research Unit No. 3, yielded a biserial
correlation of 0.22, corrected for restriction of range, between
performance in this test and the graduation-elimination criterion in primary
training. The mean score for graduates was 18.42, for eliminees 16.54,
and the standard deviation for both combined was 5.00. Of the sample,
74 percent was graduates. The standard deviation assumed for the
unrestricted pilot stanine distribution was 2.00.
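A biserial coefficient can be recomputed from the summary figures quoted above using the standard formula: the difference between the two group means, divided by the combined standard deviation, times pq/y, where y is the normal ordinate at the point cutting off the proportion p. The sketch below assumes that formula, reads the graduate mean as 18.42, and applies no correction for restriction of range; the function name is hypothetical.

```python
from statistics import NormalDist


def biserial(mean_grad, mean_elim, sd_total, p_grad):
    """Biserial r from group means, combined SD, and proportion graduating."""
    q = 1.0 - p_grad
    std_normal = NormalDist()
    z = std_normal.inv_cdf(p_grad)   # cut point on the normal curve
    y = std_normal.pdf(z)            # ordinate at the cut point
    return (mean_grad - mean_elim) / sd_total * (p_grad * q / y)


# Figures quoted for Path Length: means 18.42 and 16.54, SD 5.00,
# 74 percent graduates.
r_path_length = biserial(18.42, 16.54, 5.00, 0.74)

print(round(r_path_length, 2))
```

On these inputs the uncorrected coefficient comes out near 0.22. How that relates to the corrected figure reported in the text is not something this sketch can settle, since the correction requires information not given here.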

Evaluation. — Factor  analysis  of  this  test  indicates  that  the  nonerror 
variance  has  been  completely  accounted  for  by  the  common  factors.  Four 
percent  of  the  total  variance  is  accounted  for  by  the  general-reasoning  fac¬ 
tor,  4  percent  by  the  visualization  factor,  and  5  percent  each  by  the  spatial- 
relations  and  the  length-estimation  factors.  The  remaining  9  percent  of  the 
total  variance  accounted  for  lies  in  low  loadings  on  other  factors. 

Although  the  validity  of  this  test  for  pilots  is  satisfactory,  its  very  low 
reliability  (0.25)  and  its  factorial  complexity  make  this  test  unsatisfac¬ 
tory. 

Estimation  of  Length,  CP631A* 

This  test  was  also  designed  to  measure  ability  to  make  rapid  judg¬ 
ments  of  line  length,  but  with  much  higher  reliability  than  the  Shorter 
Line  Test,  CP606. 

Description. (1) Internal characteristics. — Near the center of each
page are shown five bars of constant width (1.5 mm.) and of standard
lengths,  arranged  in  order  from  A  to  E.  These  standards  vary  in  length 
from  1.5  to  2.0  centimeters,  in  steps  of  one-tenth  of  a  centimeter.  In  the 
first  part  of  the  test,  the  examinee  is  asked  to  match  the  length  of  each 
of  a  number  of  bars  of  varying  length  with  one  of  these  standard 
lengths.  Each  item  consists  of  a  single  bar  which  is  exactly  the  same 
length  as  one  of  the  five  standards  printed  near  the  middle  of  the  page. 

In the second part of the test each item consists of a bar which is
exactly double the length of one of the five standard lengths; the examinee
is asked to judge which of the standard lengths has been doubled
in each item.

The standards appear vertically on the page, but the variable test items
are placed at all angles. There are 75 items in each part of the test.

(2)  Administration. — Full  directions  with  one  sample  item  precede  the 
test  proper.  The  total  testing  time  is  approximately  12  minutes,  with  4 
minutes  allowed  for  part  I  and  5  minutes  for  part  II. 

* Developed at Psychological Research Unit No. 3. Chief contributors: Cpl. Albert A.
Canfield, Jr., and Lt. Robert M. Gagné.
(3) Scoring. — In experimental work with this test rights and wrongs
were scored separately.

Statistical results. — The data given below are for examinees tested at
Psychological Research Unit No. 3, unless noted to the contrary.

(1) Distribution statistics. — Distribution statistics are presented in
table 18.13. (See also table 3.1.)


Table 18.13. — Distribution constants for Estimation of Length, CP631A, based
upon a sample of 771 pilots¹

Score       Part           M       SD
Rights      I             23.3     8.3
Wrongs      I             18.4     8.1
Rights      II            22.1     7.4
Wrongs      II            33.5    11.0
Rights      I and II      45.4    13.0
Wrongs      I and II      51.9    16.6

¹ In class 44I.


(2) Internal consistency. — The degree of homogeneity of the items
is indicated by a mean internal-consistency phi of 0.18, a standard
deviation of the phi distribution of 0.20, and a range of values from -0.15
to 0.80. These statistics are based upon analysis of the responses of the
highest 27 percent and the lowest 27 percent in total score of a group of
750 unclassified aviation students tested in May 1944. Since the test is
highly speeded, the low mean value is not unexpected.

(3) Reliability coefficient. — Minimum estimates of reliability were
secured by correlating the not-quite-comparable parts I and II and
correcting for length. The data are shown in table 18.14.


Table 18.14. — Reliability coefficients estimated by correlating part I and part II for
Estimation of Length, CP631A

Group                                              Score     N      r(I,II)   r(tt)
...                                                Rights    ...    0.41      .57
Do                                                 ...       238     .40      .58
Unclassified aviation students and airplane ...¹   ...       439     .40      .57
Do¹                                                ...       425     .43      .60
...                                                Rights    586     .48      .65
Do                                                 Wrongs    586     .56      .72

¹ Tested at Medical and Psychological Examining Units Nos. 6 and 8 in April 1945.
² Part II administered approximately 4 hours after part I.
³ Part II administered immediately after part I.
⁴ In class 44L.


(4) Correlations between rights and wrongs. — Extremely interesting
correlations between rights and wrongs were reported for this test,
showing a positive association between correct and incorrect responses. The
data are presented in table 18.15.

Table 18.15. — Correlation coefficients between rights and wrongs for Estimation of
Length, CP631A

Group                                        N        r
Pilots in preflight training¹               431
    Part I rights vs. part I wrongs                  0.10
    Part I rights vs. part II wrongs                  .32
    Part II rights vs. part I wrongs                  .24
    Part II rights vs. part II wrongs                 .30
Unclassified aviation students²             500
    Part I rights vs. part I wrongs                   .21
    Part II rights vs. part II wrongs                 .32

¹ Tested in March 1944.
² Tested in May 1944.


(5)  Difficulty. — Based  upon  item  analysis  of  the  responses  of  750  un¬ 
classified  aviation  students  tested  in  May  1944,  the  test  yielded  a  mean 
proportion  of  correct  responses  of  0.36,  corrected  for  chance,  with  a 
range  from  0.00  to  0.95,  and  a  standard  deviation  of  0.21. 
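A proportion "corrected for chance" is usually obtained by subtracting from the rights the number of wrongs divided by one less than the number of alternatives. The text does not say which variant was used or whether the base was all examinees or only those attempting the item, so the sketch below is an assumption on both counts: it uses five alternatives (the five standard lengths A to E) and the number attempting as the base. The counts in the example are hypothetical.

```python
def corrected_proportion(rights, wrongs, n_alternatives=5):
    """Proportion correct on one item, corrected for chance success.

    Uses (R - W / (k - 1)) / (R + W), where k is the number of
    answer alternatives. Returns 0.0 if nobody attempted the item.
    """
    attempted = rights + wrongs
    if attempted == 0:
        return 0.0
    return (rights - wrongs / (n_alternatives - 1)) / attempted


# HYPOTHETICAL item: 300 right and 400 wrong among 700 attempts.
# Raw proportion is about 0.43; the corrected value is about 0.29.
p_corrected = corrected_proportion(300, 400)

# Pure guessing among five alternatives (1 right per 4 wrong)
# corrects to zero.
p_guessing = corrected_proportion(100, 400)

print(round(p_corrected, 3), p_guessing)
```

An item on which examinees do no better than chance corrects to zero under this formula, which is one way a corrected range can begin at 0.00.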

(6)  Test  validity. — Validation  results  based  on  two  samples  are  given 
in  table  18.16. 


Table 18.16.— Validity data for Estimation of Length, CP631A, based upon the
criterion of pilot primary graduation-elimination

Score                   N     p_g     M_g     M_e    SD_t   r_bis  ¹r_bis

Part I rights .....   ²771   0.85   23.56   21.62    7.95    0.13    0.19
Part II rights ....   ²771    .85   22.39   20.55    7.37     .14     .18
Part I wrongs .....   ²771    .85   18.29   18.84    8.34    -.04    -.06
Part II wrongs ....   ²771    .85   33.72   32.55   10.95     .06     .06
Part I rights .....   ³431    .76   25.02   23.87    7.66     .09     .13
Part II rights ....   ³431    .76   21.24   20.62    6.43     .06     .09
Part I wrongs .....   ³431    .76   20.41   20.51    8.99    -.01    -.01
Part II wrongs ....   ³431    .76   32.79   33.08   10.62    -.02     .00
Total rights ......   ³431    .76   46.26   44.49   12.14     .09     .13
Total wrongs ......   ³431    .76   53.20   53.59   15.78    -.02    -.01

¹ Corrected to an unrestricted stanine standard deviation of 2.00.

² In class 44I. Tested March 1944.

³ In class 44I. Independent of sample of 771.


(7) Item validity.— Validation of items revealed a mean phi of 0.03
based upon the responses of 400 graduates and 80 eliminees from
training in class 44I. The range was from -0.11 to +0.33, and the standard
deviation was 0.08.

Evaluation. — Estimation  of  Length,  CP631A,  seems  to  have  low  to 
moderate  pilot  validity  and  satisfactory  reliability.  The  test,  unfortu¬ 
nately,  has  not  been  submitted  to  factor  analysis,  but  it  promises  to  have 
a  substantial  loading  on  the  length-estimation  factor,  which  is  known  to 
be  valid  for  pilots. 

The  fact  that  the  standard  lengths  are  placed  vertically  on  the  page, 
while  the  variable  lengths  are  at  all  angles,  introduces  the  vertical-hori¬ 
zontal  illusion  into  the  test.  This  is  unfortunate;  future  refinements  of 
the  test  should  be  designed  to  investigate  its  effect. 

The data showing positive correlations between the number of correct
and of incorrect responses again direct attention to the important
problem of the roles of rights and wrongs in a test. No a priori scoring
formula would do justice to this peculiarity in this test. Apparently,
examinees who get a high rights score do so by working rapidly; in so
doing, they also make many mistakes. Those who get low rights scores
go slowly and cautiously, and so get low wrongs scores also. It may be
necessary to control working time by tachistoscopic exposures in order to
assure maximum univocal meaning of scores in this test and in other
perceptual tests of its type. In lieu of this, an empirically derived
scoring formula is required.

Distance Estimation, CP212A ⁷

This test is designed to measure the ability to make spatial
discriminations based on the perception of distance. Other tests for the
perception of extended distances or depth commonly involve the presentation of
stimulus objects located at 20 feet or less from the individual being
tested. The air-crew member, however, has to deal with distances in the
neighborhood of hundreds of yards, for example, in landings and
take-offs.

It is known that the cues that chiefly determine the perception of short
distances such as those represented in other depth-perception tests, namely,
the binocular cues of retinal disparity and convergence, and the monocular
cue of accommodation, diminish in importance with increasing distance
and finally disappear at the greater distances. It may be argued, therefore,
that other tests do not measure the function of distance perception as it is
defined in the pilot's task.

Distance Estimation, CP212A, is designed to involve the perception
of extended distances, and the test attempts to provide the visual stimuli
for the perception of such distances by means of photographs.

The informed reader will note the resemblance of this test to
laboratory investigations into size constancy phenomena and may question that
it is primarily a test of distance perception. The rationale of the test, as
outlined by those who developed it, is as follows: the ability to
discriminate size when distances are not involved is relatively accurate. On the
other hand, when differences of distance are involved, the accuracy of
size perception is much impaired. It is argued, therefore, that errors in
the estimation of size seen at a distance are attributable more to
inaccuracy in distance perception than to size perception per se.

Description.— The test consists of 2 sets of 20 glossy 9-inch x 12-inch
photographs, making 40 items. In the foreground of each are 15 vertical
white stakes arranged in ascending height from left to right with a wide
space in the center. The stakes actually vary from 27 to 83 inches in
height, differing from each other by 4 inches. They were photographed
at a distance of 14 yards from the camera. In the space between the
standard stakes, and at a much greater distance from the camera, is
another stake which may be one of four heights: 63, 67, 71, or 75 inches.
This stake may be presented at one of 5 distances: 28, 56, 112, 224, or
448 yards. All stakes vary in width between 2 and 4 inches. The task of
the examinee is to match the far, or distant, stake with one of the
standards presented in the foreground. A sample photograph is presented in
figure 18.6.

⁷ Originated at the Perceptual Research Headquarters, AAF Training Command, and
developed at the Psychological Test Film Unit. Further information on both the
predecessors and successors to this test may be found in another report (report 7)
of this series.

(1) Administration.— Full, detailed instructions are given with this
test, calling particular attention to the fact that the stakes vary in width.
Eighteen minutes are allowed for administration.

(2) Scoring.— The rights (R), "one-step" wrongs (W₁), and
"two-step" wrongs (W₂) are scored and used in the formula: 3R + 2W₁ + W₂.
This formula is not based upon statistical analysis, but it is designed to
credit incorrect responses in proportion to the degree of correctness they
represent.
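
The partial-credit idea behind this formula can be sketched in a few lines of Python. The sketch is illustrative only. It assumes that the 15 standard stakes are coded 1 through 15 and that a "one-step" wrong means choosing the standard next to the keyed one; the function name and the sample responses are hypothetical and do not come from the test materials.

```python
def distance_estimation_score(responses, key):
    """Return 3R + 2W1 + W2 for paired lists of chosen and keyed stake numbers."""
    # Credit by number of steps between the chosen and the keyed standard stake.
    weights = {0: 3, 1: 2, 2: 1}
    total = 0
    for chosen, correct in zip(responses, key):
        if chosen is None:  # an omitted item earns no credit
            continue
        total += weights.get(abs(chosen - correct), 0)
    return total


# Hypothetical four-item example: one right, one one-step wrong,
# one two-step wrong, and one grosser error.
key = [10, 11, 12, 13]
responses = [10, 12, 10, 5]
print(distance_estimation_score(responses, key))
```

Under these assumptions an examinee who is always within one step of the key earns at least two-thirds of the maximum, which is the sense in which the formula credits "degree of correctness."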

Statistical results.— The available data are for examinees tested at the
Psychological Test Film Unit.

(1)  Distribution  constants. — Distributions  obtained  on  this  test  are 
given  in  table  18.17. 

Table 18.17.— Distribution constants for Distance Estimation Test, CP212A, based
upon a sample of 50 returnee air-crew members

Scoring formula            Mean      SD

Rights ..............       6.4      1.4
3R + 2W₁ + W₂ .......      49.1     16.4


(2)  Reliability  coefficient. — One  sample  yielded  the  estimates  of  reli¬ 
ability  given  in  table  18.18. 


Table 18.18.— Reliability coefficients (alternate forms) for Distance Estimation
Test, CP212A, based upon a sample of 50 returnee air-crew members

Scoring formula            r_1I     r_tt

Rights ..............      0.40     0.57
3R + 2W₁ + W₂ .......       .66      .79


(3) Other data.— An experiment was performed to determine the
degree of correspondence between judgments made using the photographs,
and judgments obtained in the actual field situation represented by the
photographs. A comparison of the judgments of 13 examinees (enlisted
men and 1 civilian) in the 2 situations reveals a high degree of
relationship between them. Rank-order correlations were obtained of the
judgments made in the field situation and its photographic representation at
each of five distances on each of four heights of test object. The median
coefficient is 0.80. This is fairly good evidence that genuine distance
perception can be measured with photographic representation of the distance.

Evaluation.— This test has a certain obvious value in that it attempts
to measure distances that are truly representative of those that must be
estimated by air crew. Although the evidence of correspondence between
judgment of photographically represented distance, and distance judged
under field conditions is little more than indicative, it is certainly
sufficient to warrant further development of the test.

ANGULAR  JUDGMENT  TESTS 

The three tests described in this section all attempt to assess the ability
of the examinee to estimate angular magnitudes, an activity involved in
the duties of all air-crew members.

Angular  Judgment,  CP217A* 

This  test  was  developed  as  a  measure  of  the  ability  to  estimate  angular 
magnitudes. 

Description.— The test consists of 45 items, in each of which a drawn
angle is presented, followed by 5 possible numerical answers, one of
which is correct. The angles range from 15° to 330°, and they are
represented by line drawings in which is clearly indicated, by means of an
arrow, the angle to be considered.

(1) Internal characteristics.— The directions present two sample
problems. Six illustrative angles with correct answers are given along with
the directions, in order to provide each examinee with some frame of
reference. These illustrations, however, are not referred to during the
test. The test is divided into two parts, with 23 and 22 items respectively.

(2)  Administration. — The  test  was  administered  with  various  time 
limits,  ranging  from  4  to  10  minutes. 

(3) Scoring.— The scoring formula is R - W/4.
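
With five answer choices per item, R - W/4 is what the general correction for chance success, R - W/(k - 1), gives for k = 5. The short Python sketch below illustrates that reading; the function name and the sample counts are hypothetical, and the text itself does not state how the divisor was chosen.

```python
from fractions import Fraction


def corrected_score(rights, wrongs, n_choices):
    """Rights minus wrongs / (k - 1), where k is the number of answer choices."""
    return rights - Fraction(wrongs, n_choices - 1)


# Hypothetical examinee: 30 right and 12 wrong on five-choice items.
print(float(corrected_score(30, 12, 5)))

# Blind guessing on all 45 five-choice items is expected to yield
# 9 rights and 36 wrongs, which the formula brings back to zero.
print(float(corrected_score(9, 36, 5)))
```

The same rule with three choices gives R - W/2.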

Statistical results.— The available data all are based upon examinees
tested at Psychological Research Unit No. 2.

(1) Distribution statistics.— Typical examples of distribution statistics
are given in table 18.19.


Table 18.19.— Distribution constants for Angular Judgment, CP217A, based upon
samples of pilots in elementary training ¹


N 

CUt| 

M 

SD 

i*t 

«Jt:  *2} 

l«.« 

?.* 

111 

I  » 

II.  J 

It 

ni 

4»E;  4lf 

II  0 

14 

¹ Allowing 5 minutes per part.

•lie**  timilrntthrd.  himtnm  met c  tf'trd  in  Od«^f  IfO. 


(2) Reliability coefficient.— By the odd-even method, an estimated
reliability coefficient of 0.87, corrected for length, was obtained. This figure
is based on a sample of 734 unclassified aviation students.

1  OnfloM  >|  ltr>fjrfk  t'liil  Ne.  1.  Ckul  (ontilbulor,:  C*|H.  pka  T.  Dnkf. 

C»pc  Qtn  tinclk,  TkIl/S|I.  Piat  V  McKrxnoMt.  imi  Tctk./St».  Btnjima  Sti»Wf|. 
FIGURE 18.6

SAMPLE PHOTOGRAPH OF DISTANCE ESTIMATION, CP212A


FIGURE 18.7

SAMPLE ITEMS AND SCALE OF ANGLES OF ANGLE ESTIMATION, CP218A


(3) Test validity.— A sample of 972 pilots in classes 44F and 44I
yielded a biserial correlation of 0.20, corrected for restriction of range,
between performance in this test and the graduation-elimination criterion
in primary training. The mean score for the graduates was 18.11, for
eliminees 16.73, and the standard deviation for both combined was 6.35.
Of this sample 81 percent were graduates, and the standard deviation
assumed for the unrestricted pilot stanine distribution was 2.00.
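
The statistics in this paragraph are the ingredients of a biserial correlation, and the uncorrected coefficient can be recovered from them. The Python sketch below uses the textbook biserial formula, r = ((M_g - M_e) / SD_t)(pq / y), where y is the normal ordinate at the point cutting off the proportion p of graduates. The function name is hypothetical, and it is an assumption that the report computed the coefficient in exactly this way.

```python
from statistics import NormalDist


def biserial_r(mean_grad, mean_elim, sd_total, prop_grad):
    """Textbook biserial r from group means, total SD, and proportion passing."""
    p = prop_grad
    q = 1.0 - p
    z = NormalDist().inv_cdf(p)   # normal deviate that cuts off proportion p
    y = NormalDist().pdf(z)       # ordinate of the unit normal curve at z
    return (mean_grad - mean_elim) / sd_total * (p * q / y)


# Values reported for the sample of 972 pilots.
r = biserial_r(18.11, 16.73, 6.35, 0.81)
print(round(r, 2))
```

The result, about 0.12, is the coefficient before correction. The reported 0.20 includes the correction for restriction of range, which depends on sample statistics not given in this passage.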

Evaluation.— This test is moderately good as a predictive device for
pilot training (validity coefficient of 0.20). It also has a satisfactory
reliability (0.87). Considering these facts and, in addition, the simplicity
of design and administration, further analysis and development of this
instrument seems worth while. It is possible that this test, and the tests
Angle Estimation and Landing Judgment, to be described next, will define
a new factor.

Angle Estimation, CP218A ⁹

This test is designed to measure the ability to estimate the angle at
which an object on the ground is viewed from various points above it in
the air. From various sources, such as job analyses and subjective
judgments, there are indications that the ability to estimate such angles may
be very important in landing a plane. The pilot who more readily and
accurately estimates the angle of his approach to the landing strip should,
other things being equal, make the better landing.

Description.— The  test  is  composed  of  photographs  of  models  of  mili¬ 
tary  vehicles,  such  as  trucks,  jeeps,  and  tanks,  taken  at  angles  ranging 
from  0°  to  90°  between  the  camera  axis  and  the  plane  of  the  ground. 

The  angles  at  which  the  photographs  were  taken  are  in  multiples  of 
10  degrees.  The  task  of  the  examinees  is  to  indicate  the  angle  at  which 
the  photograph  was  taken.  On  each  page  of  the  test  there  is  a  scale  of 
angles  such  as  that  shown  in  figure  18.7.  This  scale  shows  10  different 
angles,  ranging  from  0°  to  90°,  and  each  labeled  with  a  letter  symbol. 
The  examinees  indicate  the  angle  of  the  photograph  by  using  the  appro¬ 
priate  symbol. 

(1)  Internal  characteristics. — There  are  48  items  in  the  test,  including 
2  unscored  practice  problems.  The  test  is  divided  into  two  comparable 
parts.  The  photographs  have  a  common  background  of  a  sandy  surface, 
and  none  has  any  horizon  line,  showing  no  sky. 

(2) Administration.— The examinees are informed that this is a test
of their ability to estimate the angle between the line of sight and the
surface of the ground. The scale of angles is explained as follows:

All photographs in the test have been taken from 1 of the 10 angles (of the scale of angles)
above, lettered A through J. The angles are at intervals of 10 degrees. It will be your task
to study the picture in each item and estimate the angle from which it was taken.

The total testing time is approximately 15 minutes, with 5 minutes
allowed for each part.

⁹ Developed at Psychological Research Unit No. 1. Chief contributors: Sgt. Roy C. Anderson,
Staff/Sgt. Benjamin Fruchter, and Lt. John W. Howe Jr.

(3) Scoring.— In the scoring of this test, correct answers originally
received +1, answers that were "one-step" wrong received 0, and those
"more-than-one-step" wrong received -1. A study was made to
determine the scoring formula which would maximize reliability, utilizing a
sample of 736 unclassified aviation students. This formula was found to
be R - 2W, "one-step" wrongs being eliminated from consideration.
With a constant added to eliminate negative scores, this became
R - 2W + 60.
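
A minimal Python sketch of the final formula follows. It assumes that responses and keys are recorded as the scale letters A through J, so that the number of steps between two answers is the distance between their letters; the function name and the four sample items are hypothetical.

```python
def angle_estimation_score(responses, key, constant=60):
    """Return R - 2W + constant, where W counts only more-than-one-step wrongs."""
    rights = 0
    gross_wrongs = 0
    for chosen, correct in zip(responses, key):
        if chosen is None:            # omitted item: neither right nor wrong
            continue
        steps = abs(ord(chosen) - ord(correct))
        if steps == 0:
            rights += 1
        elif steps > 1:               # one-step wrongs are left out entirely
            gross_wrongs += 1
    return rights - 2 * gross_wrongs + constant


# Hypothetical items: right, one-step wrong, four-step wrong, right.
key = list("CFJA")
responses = list("CGFA")
print(angle_estimation_score(responses, key))
```

Here the two rights are cancelled by the single gross error, so the score stays at the added constant of 60, while the one-step wrong changes nothing.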

Statistical results.— The data given below all are for examinees tested
at Psychological Research Unit No. 2 in February and March 1945.

(1) Distribution statistics.— Scoring rights only, a mean score of 17.3
and a standard deviation of 4.8 were found for a sample of 736
unclassified aviation students.

(2) Internal consistency.— The degree of homogeneity of the items of
the test is indicated by a mean internal-consistency phi of 0.26, a
standard deviation of the phi distribution of 0.11, and a range of values from
0.00 to 0.49. These statistics are based upon analysis of the responses of
the highest 27 percent and the lowest 27 percent in total score (rights
only) of a group of 736 unclassified aviation students.
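
For readers unfamiliar with the statistic, the Python sketch below computes an ordinary fourfold-table phi for a single item from the counts of passing and failing examinees in the upper and lower groups. The counts are invented for illustration (199 is roughly 27 percent of 736), and it is an assumption that the report's phi values were obtained from this formula rather than read from a chart or table.

```python
from math import sqrt


def phi_coefficient(upper_pass, upper_fail, lower_pass, lower_fail):
    """Fourfold-table phi for one item, upper versus lower criterion group."""
    a, b, c, d = upper_pass, upper_fail, lower_pass, lower_fail
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        return 0.0  # an empty row or column leaves phi undefined; report zero
    return (a * d - b * c) / denominator


# Hypothetical item: 120 of 199 pass in the upper group, 70 of 199 in the lower.
print(round(phi_coefficient(120, 79, 70, 129), 2))
```

An item passed equally often by both groups gives a phi of zero, which corresponds to the lower end of the reported range.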

(3) Reliability coefficient.— The reliabilities were estimated separately
for rights, wrongs (more-than-one-step), and wrongs (one-step). A
sample of 736 unclassified aviation students was utilized. Kuder-Richardson
data appear in table 18.20, and intercorrelations among the various
scores are presented in table 18.21.

Table 18.20.— Kuder-Richardson (formula No. 21) reliabilities of three scores for
Angle Estimation, CP218A

Score                                         M       SD

Total rights ..........................     17.1     4.8
Total wrongs (more-than-one-step) .....      9.6     5.1
Total wrongs (one-step) ...............     20.9     1.7

Table 18.21.— Product-moment correlations among the various scores for Angle
Estimation, CP218A


Score

1 Rights, part I
2 Rights, part II
3 More-than-one-step wrongs, part I
4 More-than-one-step wrongs, part II
5 One-step wrongs, part I
6 One-step wrongs, part II


I 

n 

n 

4 

5 

8 

m 

019 

-0.62 

-0.45 

-0.38 

O.lo 

Oil 

•  •  ■ 

-.41 

-.61 

.01 

-.18 

-.41 

.5 

-.36 

—.18 

S3 

-.61 

.14 

•  •  • 

-.07 

-.46 

-,J» 

.01 

-.36 

-.07 

*  r 

.10 

-.38 

-.16 

-.46 

1 

fii 

1 

•  •  • 

It can be seen that one-step wrongs are highly unreliable.
For this sample, the maximum-reliability scoring formula would yield
a part I-part II correlation of 0.58. Correcting for length, this becomes
0.73.
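
The correction for length is presumably the Spearman-Brown formula, which estimates the reliability of the full test from the correlation between its two halves. A brief Python sketch, with a hypothetical function name, reproduces the figure just quoted.

```python
def spearman_brown(r_part, length_factor=2.0):
    """Estimated reliability of a test lengthened by the given factor."""
    return length_factor * r_part / (1.0 + (length_factor - 1.0) * r_part)


# Part I-part II correlation of 0.58, stepped up to the full two-part test.
print(round(spearman_brown(0.58), 2))
```

Doubling is the default because the two parts together make up the whole test; other values of `length_factor` would estimate the effect of lengthening the test further.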

(4) Difficulty.— Based upon item analysis of the responses of 750
unclassified aviation students, the test yielded a mean proportion of
correct responses of 0.23, corrected for chance, with a range from 0.05 to
0.85 and a standard deviation of 0.19.

Evaluation.— This test, with Angular Judgment, CP217A, may measure
a new factor. The difficulty level, however, is unsatisfactory. The data
for the reliabilities of rights and the two wrongs scores suggest that the
differences utilized in the photographs are not gross enough to provide
the most reliable score. Some preliminary psychophysical determinations
are strongly indicated.

Landing Judgment, CP505B ¹⁰

This test was designed to measure height judgment. Analysis of the
act of landing an airplane showed that students lacking in height
judgment are bad risks.

Description. — Specifically,  the  test  requires  the  examinee  to  learn, 
remember,  and  later  select  the  point  at  which  to  break  the  glide  in  land¬ 
ing.  It  consists  of  photographs  simulating  the  pilot’s  view  of  a  landing 
strip,  road,  or  field  as  he  comes  in  for  a  landing.  Figure  18.8  is  an  ex¬ 
ample  of  the  test  items. 

(1) Internal characteristics.— A period of training precedes the test
proper. At the beginning of the test, the examinees are shown three
views of a landing strip, each taken at a different height. One, they are
told, shows the correct height (15 feet); another is too low (5 to 10
feet); and the last too high (20 to 25 feet). The examinees are to judge
each photograph, marking A if it is the correct height for leveling off;
B if it is too low; and C if it is too high. The photographs are printed
each showing about the same amount of sky above the horizon; in this
way, one cue was eliminated. The cues contributing to correct judgment
are linear perspective and surface texture.

The examinees then study another series of three photographs, this
time of a grassy field. After administrative instructions, they are allowed
2 minutes to restudy these first six photographs.

Following  this  reinforcement,  the  examinees  answer  four  sample  prob¬ 
lems.  After  putting  down  their  answers,  the  examinees  are  told  the  cor¬ 
rect  answers. 

The  test  is  divided  into  2  parts  of  26  items  each. 

(2) Administration.— The examinees are told that:

This is a test of your ability to learn and remember what the ground looks like
from the correct height for leveling off during the final part of the glide. For this
test, 15 feet above the ground has been selected arbitrarily as the correct height for

••  |)errtore<t  at  l*«)rtiotn*>eat  Rucarcb  frit  No.  J.  C>i*f  contributor!:  Pvt.  CkarU*  1C. 
Ftntu.nn,  CpI.  Harul.l  II.  Krltey.  Sufl/Stf.  Wajm*  S.  Zimmerman. 
leveling off . . . The photographs have been taken in such a way that you will not be
able to determine the height by the amount of sky in the picture or the place where
roads or fields are cut off at the edges of the picture. Instead you must judge or
"feel" the height from the picture as a whole.

The examinees are allowed to study the picture giving the correct
heights and heights too low and too high during the reading of the
directions. After the four sample problems are worked, 1½ minutes are
allowed for the reexamination of these problems before the test begins.
The total testing time is approximately 12 minutes, with 3 minutes
allowed for the first part, and 2 minutes and 45 seconds for the second
part.

(3) Scoring.— The scoring formula is R - W/2.

Statistical results.— No statistics are available for this test.

Evaluation.— This test is obviously an attempt to duplicate a part of
the landing situation as nearly as possible using paper and pencil. It seems
probable that it will have loadings on several factors. It may, however,
help to define a new factor held in common with the Angle Estimation
and Angular Judgment tests. The approach underlying the construction of
this test, that is, constructing tests that approximate a work sample,
largely ignores the individual basic factors involved in the activity. This
differs from the approach stressed in the latter part of the classification
program — the construction of pure tests.

JUDGMENT  OF  PROPORTIONS  TEST 

Only  one  test  in  this  area  was  constructed. 

Judgment of Proportions, CP206B ¹¹

This test is designed to measure the ability to recognize accurately the
correct proportions of familiar objects. The rationale underlying this
test is that accuracy of object perception is important for air-crew
personnel, and that successful recognition of the correct proportions of
familiar objects reflects previous interest, accuracy of perception, and
good visual memory for seen objects and for spatial dimensions in
general.

Description.— The examinee is asked to select, from five simple
outline diagrams, the one that has the correct proportions of a given
familiar object, such as a standard building brick, a one-dollar bill, a pack of
standard-size cigarettes. Despite the fact that all items in the test
actually have three dimensions, only two are depicted, and these are
specified. Two items are presented in figure 18.9. There are 30 items in the
test.

¹¹ Developed at Headquarters AAF Training Command. Chief contributors: Capt. Richard
M. Henneman and Staff of the Perceptual Research Unit.

FIGURE 18.9

SAMPLE ITEMS OF JUDGMENT OF PROPORTIONS, CP206B

(1) Administration.— The examinees are told that this is a test of
ability to judge the shape of familiar objects. Particular attention is
called to the directions concerning the dimensions of the objects that are
to be considered. Testing time for the 30 items is 8 minutes.

(2) Scoring.— The scoring formula is R - W/4.

Statistical results. (1) Distribution statistics.— Typical examples of
distribution statistics obtained on this test are given in table 18.22. The
distribution curves are approximately symmetrical and somewhat flatter
than normal.


Table  18.22. —  Distribution  constants  for  Judgment  of  Proportions,  CP206B, 
for  samples  of  unclassified  aviation  students 


N 

M 

SD 

1,098 

10.9 

4.4 

392 

12.0 

4.( 

(2)  Reliability  coefficient. — By  the  odd-even  method,  an  estimated  re¬ 
liability  coefficient  of  0.52,  corrected  for  length,  was  obtained.  This 
figure  is  based  on  a  sample  of  1,098  unclassified  aviation  students. 

(3) Factorial composition.— The most significant loadings are in the
planning (0.30), visualization (0.29), verbal (0.22), and
perceptual-speed (0.22) factors. The communality is 0.35. For a fuller picture of
the factorial composition of this test, see appendix B.

(4) Test validity.— Validation results based on several samples are
given in table 18.23.


Table  18.23. —  Validation  data  for  Judgment  of  Proportions,  CP206B,  for  the 
graduation-elimination  criterion 


Croup 

Cla*» 

N, 

1 

t. 

M, 

M. 

Sl», 

r»,. 

I'ilot*  in  primary  training  .... 

4 lit:  411 

>3*0 

0.60 

11.  SI 

10.95 

4.38 

0.0* 

1)0  . . . 

4JK 

>178 

.87 

12.93 

11.58 

4.47 

.16 

Klrxitjlc  cunnrry  student*  . . . 

>327 

.93 

13.51 

11.73 

1.78 

.24 

*  Tested  at  Psychological  Research  Unit  No.  I. 

*  Tested  at  Psychological  Research  Unit  No.  3. 

*  Unidentified. 


Evaluation.— This test was included in two factor analyses. The
weighted average of the factor loadings shows that 35 percent of the
total variance has been accounted for by the common factors. Of this,
the verbal and perceptual-speed factors each contribute 5 percent, the
visualization factor 8 percent, and the planning factor 9 percent. The
remaining 8 percent is accounted for by other factors on which the
loadings are quite low. Since the test was designed to measure visual imagery,
among other things, it is moderately successful, for the visualization
factor accounts for about a quarter of the common-factor variance. In
terms of the total variance, however, the visualization portion is quite
small. In addition to this fact, the test has a very low reliability (0.52),
and the problem of presenting objects equally familiar to all examinees
was not solved satisfactorily.
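
The percentages in this evaluation follow from squaring the factor loadings reported under factorial composition, since with uncorrelated factors a squared loading is the proportion of total variance attributable to that factor. The Python sketch below repeats the arithmetic; the variable names are illustrative, and orthogonal factors are assumed.

```python
# Loadings and communality reported for Judgment of Proportions, CP206B.
loadings = {
    "planning": 0.30,
    "visualization": 0.29,
    "verbal": 0.22,
    "perceptual speed": 0.22,
}
communality = 0.35

# With uncorrelated factors, each squared loading is a share of total variance.
shares = {name: value ** 2 for name, value in loadings.items()}
other = communality - sum(shares.values())

for name, share in shares.items():
    print(f"{name}: {share:.0%} of total variance")
print(f"other factors: {other:.0%} of total variance")
print(f"visualization: {shares['visualization'] / communality:.0%} of common-factor variance")
```

The visualization share of the common-factor variance comes out near 24 percent, which matches the "about a quarter" stated above.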

The elementary pilot validity coefficient is quite low (weighted
average approximately 0.08). An estimated validity coefficient, computed
from the valid factor loadings (see ch. 28) is 0.18, which suggests a
sampling error in the obtained validity. At least one can say that all the
factors in the Judgment of Proportions test that are valid for pilot
training have been fully accounted for.

This test utilizes subject matter different from any other, but there
exist, in other tests, better measures of the valid factors included in
Judgment of Proportions, and these tests have a considerably higher
reliability. For these reasons, further work with the test was not considered
worth while.


SUMMARY  AND  EVALUATION 

Under the category of size and distance perception, seven printed tests
involving linear and nonlinear extents, three involving angular
magnitudes, and one involving perception of proportion, were developed and
studied.

Little evaluation can be given at present of the three tests of angular
estimation, since factorial data are not available, and validity data are
available for but one of the tests. There is an interesting possibility,
however, that they may define a new factor, perhaps valid for one or more of
the air-crew positions. The available evidence suggests that Angle
Estimation be revised, utilizing slightly grosser stimulus differences.

The Judgment of Proportions test seems to be of little value. It had
little or no validity to show for the pilot criterion, though from its factor
content one would have expected more. Its reliability is highly
unsatisfactory; and it is factorially complex, with no very high loading on any
factor yet established. There is an indication that some of the variance
of the test has not been accounted for, but it holds no promise for pilot
validity.

The  other  tests  discussed  in  this  chapter,  perliaps  with  the  exception 
of  Distance  Estimation,  CP212A,  may  be  considered  as  a  group.  It 
should  be  noted,  first,  that  none  of  the  tests  for  which  data  are  available 
is  a  very  satisfactory  measure  of  the  length-estimation  factor.  Of  the 
tests  considered  in  this  cliapter,  the  highest  loading  secured  on  length- 
estimation  was  0.46,  for  the  Shortest  Path  test.1*  This  test,  however,  is 
contaminated  with  loadings  in  the  perceptual-speed,  visualization,  and 
spatial-relations  factors.  Shorter  Line,  with  a  loading  of  0.44,  is  much 
more  pure.  Increasing  its  reliability  should  make  this  test  quite  satisfac¬ 
tory.  This  was  attempted  in  the  Estimation  of  Length  test,  for  which  fac¬ 
torial  data  are  not  yet  available. 

Comparing  the  factor  patterns  of  Shorter  Line,  Nearest  Point,  Shortest 
Path,  Map  Distance,  and  Path  Length,  we  find  very  strong  evidence  that 
the  best  length-estimation  test  requires  a  simple  comparison  of  two  linear 
lengths  presented  as  continuous,  uninterrupted  extents,  regularly  placed 
in  one  orientation,  either  horizontal  or  vertical.  The  introduction  of  ex¬ 
tents  bounded  by  points,  of  curved  or  irregular  lines,  of  distracting  back¬ 
ground,  of  irregular  placement  (introducing  the  vcrtical-liorizontal  illu¬ 
sion) — all  seem  to  bring  in  other  factors,  particularly  visualization.  The 
results  for  one  test  (Pattern  Assembly;  see  ch.  17),  however,  are  against 
these  conclusions.  A  decision  must  «wait  further  factorial  evidence. 

Three  other  points  are  worth  mentioning.  First,  the  fact  that  all  the 
tests  considered  in  this  chapter  deal  with  limited  sizes  and  distances 
raises  the  question  of  the  generality  of  the  length-estimation  factor.  Will 
tests  of  the  ability  to  perceive  correctly  much  greater  distances  involve 
the same factor? Evidence from experimental psychology suggests an
affirmative  answer  to  the  question. 

Secondly,  the  analysis  of  two  of  the  tests  in  this  chapter  served  to  call 
sharp  attention  to  the  potentially  distinct  roles  of  incorrect  and  correct 
responses,  and  to  the  dangers  inherent  in  the  indiscriminate  use  of  a 
priori  scoring  formulas.  The  data  for  the  Map  Distance  test  dramatically 
show  how  the  factorial  composition  of  a  test  may  change  with  varying 
penalties for incorrect responses. The data for the Estimation of Length test show a well-substantiated positive correlation between numbers of correct and incorrect responses.

1 Pattern Assembly has a loading of [illegible] in the length-estimation factor. See the discussion of the test in ch. 17.

Thirdly, the promise of the Distance Estimation test, utilizing photographic representation of extended distances, is most encouraging. If the correlation of distance estimation in three-dimensional space with estimation in space represented in two dimensions is well established in a
large-scale  study,  and  if  the  correlation  is  sufficiently  high,  the  potential 
usefulness  of  printed  tests  will  be  greatly  extended. 




CHAPTER NINETEEN


Spatial  Tests1 


RATIONALE  FOR  SPATIAL  TESTS 
The  Concept  of  Spatial  Ability 

The  concept  of  a  measurable  spatial  ability  was  brought  to  attention 
primarily as a result of factor analyses of intellectual abilities. Corroborating the earlier work of Kelley (1), Thurstone listed a factor S, described as “facility in spatial and visual imagery,” among his seven primary mental abilities (2).

Unlike other chapter designations in this volume, the category “spatial tests” is not an old and familiar term in differential psychology. In fact, the concept is still being formed. The concept of spatial ability, like other concepts born of factor analysis, was developed inductively. In the inductive approach, statistical evidence is the basis for deriving a new
psychological  concept.  For  many  of  the  older  psychological  concepts 
almost  the  reverse  has  been  true.  A  variable  in  which  individuals  differ 
would  be  assumed,  then  tests  would  be  built  to  measure  it,  and  statistical 
results  would  be  compiled  to  describe  individuals  and  populations  with 
respect to it.

Factor  Studies  in  Aviation  Psychology 

In  the  work  of  the  AAF  Psychological  Research  Units,  an  effort  to 
study  the  nature  of  spatial  abilities  was  first  stimulated  by  the  appearance 
of a new and unidentified factor in an analysis of a battery of tests designed to measure foresight-and-planning ability (see ch. 9). For want of a better term, the factor was labeled the “nonmotor” or, sometimes, the “intellectual” component of Complex Coordination, since its highest loading was in that test.* It was recognized that this label left much to be desired, but the diverse nature and limited number of tests permitted no more positive designation. It was suggested as one hypothesis, however, that this might be a spatial factor, since the tests with substantial loadings called for the correlation of spatial arrangements in stimulus and response.

This new finding was of particular interest, because the pilot validity of this test was very high; yet its predictive value was not fully accounted for either by the factors already identified, including mechanical experience and perceptual speed, or by assumed motor-coordination factors. If Complex Coordination’s validity were due in part to its loading on the new factor, important implications could be drawn. It was decided that the new factor should be fully explored and identified. Since it appeared to be measurable by means of printed tests, it was felt that efforts should be made to maximize its content in such more economical tests.

1 Written by Lt. John W. Howe Jr. and Staff/Sgt. Wayne S. Zimmerman.

* See page 122 for a brief description of this test, and Report No. 4 for a complete description.

A  few  months  subsequent  to  the  aforementioned  analysis,  two  other 
analyses  were  completed,  both  of  which  verified  the  new  factor.  In  the 
first,  Instrument  Comprehension  II,  CI616B,  Complex  Coordination  and 
Planning  Air  Maneuvers  appeared  together;  and  in  the  second,  Planning 
A  Course,  CI406AX2,  Planning  Air  Maneuvers,  and  Complex  Coordi¬ 
nation  defined  a  factor. 

The  new  tests  were  scrutinized  for  further  clues.  Planning  A  Course 
and, to a lesser extent, Instrument Comprehension II, formerly had been
described  as  tests  of  "integrative  ability,”  i.  e.,  the  ability  to  synthesize 
quickly  the  influence  of  several  environmental  factors  that  bear  upon  the 
choice  of  a  single  direction  of  action.  Accordingly,  the  hypothesis  was 
entertained for a while that the factor was one of integrative ability. But later evidence weakened this belief and supported instead the earlier surmise that the factor was, in reality, one of spatial ability.

Later,  Psychological  Research  Unit  No.  3  completed  analyses  of  two 
other batteries, both of which confirmed the factor. In the first of these analyses, Discrimination Reaction Time, CP611D, Complex Coordination, Two-Hand Coordination, CM101A, and Dial and Table Reading, CP622A and CP621A, appeared together with strong projections on a common reference axis. In the second analysis the following tests had substantial projections on a single axis: Complex Coordination; Flags, Figures, and Cards, CP512A; Directional Orientation, CP515; Cubes, CP512A; Table Reading, CP621A; and Dial Reading, CP622A.

Purposely  included  in  the  second  correlational  matrix  were  two  of  the 
tests that Thurstone had originally found to determine his spatial factor, namely, Cubes, and Flags, Figures, and Cards. Their emergence on the factor supported the case for the spatial hypothesis.

In  attempting  to  define  this  spatial  ability  in  more  specific  terms,  a 
great  deal  of  difficulty  has  been  encountered.  As  mentioned  in  the  intro¬ 
ductory  paragraph  of  this  chapter,  Thurstone  described  his  spatial  factor 
as  "facility  in  visual  and  spatial  imagery.”  He  failed  to  isolate  an  addi¬ 
tional  factor  that  could  be  called  visualization,  although  further  rotations 
involving  one  of  his  residual  axes  would  have  revealed  it.  Actually  the 
expression  "facility  in  the  use  of  visual  imagery”  better  describes  a 
factor  which  in  this  volume  is  labeled  visualization. 

The  precise  definition  of  the  spatial  factor  must  wait  until  more  crucial 
results  are  available.  It  is  generally  agreed  that  the  factor  is  a  spatial  one, 
but beyond that point there are divergences of opinion. Two hypotheses which have been proposed deserve consideration here. According to one hypothesis, it is an ability to make discriminations as to direction of motion. The term “discrimination” here does not carry the usual connotation of perceiving small differences, for obviously the spatial distinctions called for in the tests are very gross. The decisions are frequently merely between up and down, left and right, and out and in.

Another hypothesis is that the ability is concerned with the general apprehension of spatial relations. Either stimuli or responses or both in the spatial tests are arranged in spatial patterns, and there is frequently a systematic relationship between order in the stimulus and order in the response. In such a test, therefore, the essence of the spatial factor could be (1) ability to perceive visual-spatial arrangements, (2) ability to organize movements in spatially-determined order, or (3) ability to relate specific spatial locus or arrangement within the stimulus pattern with specific locus or arrangement within the response pattern. The second and
third  characteristics  would  apply  to  the  psychomotor  tests  but  not  to  all 
printed  tests.  The  first  of  these  three  must  therefore  be  the  most 
significant. 

The  tests  in  this  chapter  are  divided  into  two  subareas.  In  keeping 
with  the  policy  followed  in  general  throughout  this  volume,  the  subarea 
designations  are  selected  on  logical  evidences  which  follow  from  the 
surface  characteristics  of  the  tests  rather  than  on  factorial  grounds. 
Since  the  tests  in  the  first  subarea  tend  to  call  for  distinctions  as  to  direc¬ 
tion,  the  title  selected  for  the  group  is  "Directional  Discrimination  Tests.” 
The second section will be entitled “Positional Discrimination Tests.” As it turned out, the division of tests along these lines coincides quite well with the distinction between two space factors, S1 and S2, which will be
described  later  in  this  chapter. 

DIRECTIONAL  DISCRIMINATION  TESTS 

The  following  seven  tests  are  treated  in  the  first  subarea:  Instrument 
Comprehension  I,  CI615A,  B;  Instrument  Comprehension  II,  CI616A, 
B,  C;  Aerial  Orientation,  CP520A,  B;  Flight  Orientation,  CP528A; 
Stick  and  Rudder  Orientation,  CP531A;  Discrimination  Reaction  Time 
(paper),  CP634A;  and  Directional  Marking,  CP533A. 

Instrument Comprehension, CI615B 3 and CI616B 4

This test consists of two parts sufficiently distinct to warrant two code numbers. Each was designed to measure certain aspects of the ability to comprehend a plane’s behavior on the basis of instrument readings. When the test idea was first proposed in July 1942, it was called Dial Interpretation, and the following rationale was presented:

This test will measure several abilities:

(1) Ability to visualize the behavior of the plane in space.

(2) Ability to translate a verbal description of the plane's behavior into a visual image of the plane in that maneuver.

(3) Ability to interpret dial readings in terms of the plane's behavior.

(4) Skill at dial reading.

(5) Understanding of the relations between 1, 2, and 3 above.

(6) Ability to keep in mind several factors at one time.

3 Developed at Psychological Research Unit No. 3. Chief contributors to the original form CI615A: Capt. Milton Hurdman and Lt. Wilbur S. Gregory. Chief contributor to form CI615B: Lt. David H. Jenkins.

4 Developed at Psychological Research Unit No. 3. Chief contributor to the original form CI616A: Capt. Milton Hurdman. Chief contributor to form CI616B: Lt. David H. Jenkins.

Description. — Form  CI615B-CI616B  is  chosen  for  presentation  here, 
since  it  is  the  most  representative  form  and  the  first  to  be  included  in 
the  classification  battery. 

For each item in part I (CI615B), the examinee must refer to drawings of six instruments: altimeter, compass, air-speed indicator, artificial horizon, rate-of-climb dial, and turn-bank indicator. He must then select the correct one of five written descriptions of a plane’s position in flight.

In  part  II  (CI616B),  each  item  presents  drawings  of  only  two  instru¬ 
ments,  compass  and  artificial  horizon,  followed  by  five  small  photo¬ 
graphs  showing  an  airplane  in  five  different  positions,  e.  g.,  headed 
north  and  climbing,  headed  northwest  and  diving,  etc.  The  examinee 
must  choose  the  picture  which  is  in  agreement  with  the  two  instrument 
readings. 

Part II seems to be superior to part I on an a priori basis. It requires
the  examinee  to  relate  the  dial  readings  and  plane’s  behavior  in  a  much 
more  direct  and  simplified  manner  than  part  I.  Also,  it  does  not  involve 
a  possible  reading-speed  factor  and  the  verbal-comprehension  factor 
which seem to be apparent in part I. The verbal nature of the items in
part  I  is  comparable  to  the  type  of  directions  that  instructors  give  stu¬ 
dent  pilots,  however,  so  that  part  I  possibly  makes  a  contribution  to  the 
test  that  part  II  does  not.  Also,  because  part  I  involves  several  dials,  it 
does measure ability to attend to several factors at once to a greater extent than part II does. In view of these facts, it was originally believed that the test should include both types of items until their validities had been experimentally determined. It will be seen later that Instrument Comprehension proved to be quite valuable, but not exactly for the theoretical reasons first stated. It remained for factor analysis to reveal the
underlying  sources  from  which  the  test  derived  the  high  validity  that 
it  achieved. 

(1) Internal characteristics. — Part I contains 1 unrecorded and unscored practice item and 15 recorded and scored test items. Part II contains 60 scored items.

(2)  Administration. — For  part  I  the  instructions  and  the  single  prac¬ 
tice  problem  following  them  require  about  10  minutes.  Working  time 
for part I is 12 minutes, which allows about 80 percent of the examinees to finish. The time for part II is 4 minutes for instructions and 15 minutes working time. Approximately 40 percent of the examinees finish in the allotted time. Thus, in this part, a greater premium is placed upon speed.

Dials for sample A are shown in figure 19.1. The following are excerpts from the directions for part I:


FIGURE 19.1. DIALS FOR SAMPLE PROBLEMS, INSTRUMENT COMPREHENSION I, CI615B
(Dials shown: artificial horizon, air speed, altimeter, compass, rate of climb, turn-bank.)

This  is  a  test  of  your  ability  to  interpret  dial  readings  on  the  instrument  panel  of 
an  airplane. 

Six of the dials commonly found in military planes are used. Dial I is . . .

Each  of  the  six  dials  is  then  quite  fully  explained  and  illustrated.  Fol¬ 
lowing  these  explanations,  the  directions  continue: 

In  each  problem  of  part  I  diagrams  of  the  six  dials  appear  on  the  left  page  of  the 
booklet. Opposite the dials, on the right page, are five written descriptions of the plane's behavior. You are to examine the dials on the left page, then choose the correct description from those appearing opposite the problem on the right page:

Sample  Problem  A 

A.  Flying  level  at  200  m.  p.  h.  straight  and  unbanked,  headed  due  south,  gain¬ 
ing  altitude  at  0,800  feet. 

B.  Flying  level  at  200  m.  p.  h.  straight  and  unbanked,  headed  due  south,  los¬ 
ing  altitude  at  5,000  feet. 

C. Flying level at 200 m. p. h. straight and unbanked, headed due south, maintaining altitude at 4,000 feet.

D. Flying level at 200 m. p. h. straight and banked to left, headed due north, maintaining altitude at 4,000 feet.

E. Flying level at 200 m. p. h. turning properly to left, with 30° bank, maintaining altitude, headed due north at 4,000 feet.

Description  C  is  the  correct  answer. 

Part  of  the  directions  from  part  II  follow,  and  a  sample  problem  is 
shown  in  figure  19.2. 

In  each  of  the  problems  in  part  II  you  will  be  given  a  picture  of  a  single  plane  in 
five different positions. At the left of the picture you will be shown two dials, an artificial horizon and a compass. You are to choose the position of the plane which agrees with the readings on these dials.

Now examine the dial readings at the left of the pictures in the sample problem.
Then  look  at  the  five  positions  of  the  plane  and  select  the  correct  position. 

According  to  the  dials,  the  plane  is  banked  left,  flying  level,  and  is  headed  south. 
(B)  is  the  correct  answer,  because  at  position  (B)  the  plane  is  banked  left,  flying 
level,  and  is  headed  south. 

(3) Scoring. — The scoring formula used for part I (CI615B) is of particular interest, because it is the only instance in any classification battery in which correct answers actually decrease the score. The formula for the score on part I was 20 - (R - W/4). The reason for giving negative weight to correct answers is that part I was found to possess a low positive validity and such high correlations with other tests that its beta weight was negative. Accordingly, in scoring, the axes were reversed, with high scores becoming low scores and vice versa. The data presented below, however, are for the formula R - W/4.
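How a test with a low positive validity can still receive a negative beta weight may be seen from the standard two-predictor regression formulas. The sketch below is illustrative only: the correlations are hypothetical values chosen for the example, not figures from the classification battery, and the function name is an assumption introduced here.

```python
def two_predictor_betas(r1y, r2y, r12):
    """Standardized regression weights for two correlated predictors.

    r1y, r2y: validities of predictors 1 and 2 against the criterion.
    r12: correlation between the two predictors.
    """
    denom = 1.0 - r12 ** 2
    beta1 = (r1y - r12 * r2y) / denom
    beta2 = (r2y - r12 * r1y) / denom
    return beta1, beta2


# Hypothetical values: predictor 1 has a low positive validity (0.20) and a
# high correlation (0.60) with a more valid predictor 2 (0.45).
beta1, beta2 = two_predictor_betas(r1y=0.20, r2y=0.45, r12=0.60)
print(round(beta1, 3), round(beta2, 3))
```

With these assumed values the weight for the first predictor comes out below zero even though its own validity is positive, which is the situation described for part I.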

Part II was scored simply R - W/4.
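The two scoring formulas can be sketched as follows. It is assumed here that R is the number of right answers and W the number of wrong answers, and that the divisor 4 reflects the five choices per item; the function names are hypothetical and introduced only for illustration.

```python
def formula_score(rights, wrongs, n_choices=5):
    """Rights minus a fraction of wrongs: R - W/(k - 1); R - W/4 when k = 5."""
    return rights - wrongs / (n_choices - 1)


def reversed_part_one_score(rights, wrongs):
    """Reversed score reported for part I: 20 - (R - W/4)."""
    return 20 - formula_score(rights, wrongs)


# An examinee with 12 right and 4 wrong on part I (illustrative numbers only).
print(formula_score(12, 4))            # R - W/4
print(reversed_part_one_score(12, 4))  # 20 - (R - W/4)
```

Under the reversed formula a better performance yields a lower score, which is how correct answers came to decrease the score on part I.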

Statistical results. — Since Instrument Comprehension was a classification-battery test, the statistical data are very complete.

(1) Distribution statistics. — Typical examples of distribution statistics are given in table 19.1. The distribution curves for part I and part II are negatively skewed and considerably flatter than normal.


Table 19.1. — Distribution constants for Instrument Comprehension, CI615B and CI616B

Form          Sample                                        N             Mean          SD
[illegible]   Unclassified aviation students (Post CTD 1)   [illegible]   10.6          [illegible]
CI616B        Unclassified aviation students (Post CTD 1)   1,500         29.6          10.1
CI615B        Armament trainees 4                           269           12.5          2.9
CI616B        Armament trainees 4                           269           22.6          9.3

1 CTD stands for College Training Detachment.

2 Tested at Medical and Psychological Examining Units Nos. [illegible] with the November 1943 battery.

3 Tested at Psychological Research Units Nos. 1, 2, and 3 with the November 1943 battery.

4 Previously eliminated from pilot training.

5 In Lowry Field classes 34-44A and 35-44A. Sample consists of examinees tested at all units with the November 1943 battery.

(2) Internal consistency. — The degree of homogeneity of part II of the test is indicated by a mean internal-consistency phi of 0.38, a standard deviation of the phi distribution of 0.14, and a range of values from -0.16 to 0.68. The criterion used was the score on part II. These statistics are based upon analysis of the responses of the highest 25 percent and lowest 25 percent in total score of a group of 800 unclassified aviation students, tested in October 1943 at Psychological Research Unit No. 3.
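An internal-consistency phi of this kind can be illustrated with the ordinary fourfold (2 x 2) phi coefficient, computed for one item from the numbers passing and failing it in the high and low criterion groups. This is a minimal sketch under that assumption; the source does not state the exact computing procedure, and the counts below are invented for the example.

```python
import math


def phi_coefficient(upper_pass, upper_fail, lower_pass, lower_fail):
    """Fourfold phi between group membership (upper/lower) and item pass/fail."""
    a, b, c, d = upper_pass, upper_fail, lower_pass, lower_fail
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        return 0.0  # undefined when a row or column total is zero
    return (a * d - b * c) / denom


# With 800 examinees, the highest and lowest 25 percent contain 200 each.
# Hypothetical item: 180 of the upper group and 100 of the lower group pass.
phi = phi_coefficient(upper_pass=180, upper_fail=20, lower_pass=100, lower_fail=100)
print(round(phi, 2))
```

A mean of such item values, taken over all items in the test, would correspond to the mean phi reported above.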

(3) Reliability coefficient. — Three samples yielded the estimates of reliability given in table 19.2.

FIGURE 19.3. SAMPLE ITEM OF AERIAL ORIENTATION, CP520A

FIGURE 19.4. SAMPLE ITEMS OF FLIGHT ORIENTATION, CP528A


Table 19.2. — Estimated reliability coefficients for Instrument Comprehension based upon samples of unclassified aviation students

Form        N             Method        r             r (corrected)
[row illegible]
CI616B *    1,000         do.           .88           .94
CI616C      [remainder of row illegible]

1 Tested at Medical and Psychological Examining Unit No. [illegible] with the November battery.

* Note that this form is highly speeded; odd-even reliability estimates, therefore, are spuriously high.

3 Tested at Medical and Psychological Unit No. 7 with the November 1943 battery.

4 Tested at Medical and Psychological Examining Unit No. 8 in early 1945.

(4)  Difficulty. — Based  upon  item  analysis  of  the  responses  of  the 
above-mentioned  sample  of  800  unclassified  students,  part  II  of  the  test 
yielded  a  mean  proportion  of  correct  responses  of  0.68,  corrected  for 
chance,  with  a  range  from  0.24  to  0.98  and  a  standard  deviation  of  0.17. 
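One common way of correcting a proportion of correct responses for chance on a k-choice item is shown below. Whether this was the exact correction applied is not stated in the text, so the formula, the function name, and the sample proportion are assumptions made for illustration.

```python
def proportion_corrected_for_chance(p_correct, n_choices=5):
    """Corrected proportion: (p - 1/k) / (1 - 1/k), assuming every examinee answers."""
    chance = 1.0 / n_choices
    return (p_correct - chance) / (1.0 - chance)


# Illustrative only: an uncorrected proportion of 0.744 on a five-choice item.
print(round(proportion_corrected_for_chance(0.744), 2))
```

On this assumption a corrected value near 0.68 would correspond to an uncorrected proportion of roughly three-quarters of the examinees answering correctly.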

(5) Factorial composition. — The most prominent loadings for part I (for the preliminary form CI615A; see below) are in the spatial-relations (0.44), reasoning II (0.34), psychomotor III (0.24), verbal (0.22),
general-reasoning  (0.21),  and  integration  II  (0.21)  factors.  The  most 
prominent  loadings  for  part  II  are  in  the  spatial-relations  (0.53),  reason¬ 
ing  II  (0.36),  visualization  (0.25),  and  verbal  (0.24)  factors.  The  com- 
munality  for  part  I  is  0.57  and  for  part  II  0.65.  For  a  full  picture  of  the 
factorial  composition  of  this  test  see  Appendix  B. 
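The relation between the loadings and the communalities quoted above can be checked by a short calculation. It is assumed here that the factors are orthogonal, so that a communality equals the sum of the squared loadings; only the prominent loadings listed in the text are used, and the variable names are introduced for the example.

```python
def sum_of_squared_loadings(loadings):
    """Variance accounted for by a set of orthogonal factor loadings."""
    return sum(x * x for x in loadings)


# Prominent loadings quoted in the text for each part.
part_one = [0.44, 0.34, 0.24, 0.22, 0.21, 0.21]
part_two = [0.53, 0.36, 0.25, 0.24]

print(round(sum_of_squared_loadings(part_one), 4))  # compare with communality 0.57
print(round(sum_of_squared_loadings(part_two), 4))  # compare with communality 0.65
```

Under this assumption the listed loadings account for about 0.50 of the 0.57 communality of part I and about 0.53 of the 0.65 communality of part II; the remainder would come from the smaller loadings not listed here.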

(6) Test validity. — Validation results based on several samples are given in tables 19.3 to 19.6 for Form A as well as for Form B. (See below for a brief description of Form A.)

(7) Item validity. — Validation of items of this test disclosed the results recorded in table 19.7.

Variations. — CI615A-CI616A  was  the  original  experimental  form,  in 
which  part  I  contained  15  items,  and  part  II,  30  items.  Part  I  had  only 
low  validity,  while  part  II's  validity  was  high  enough  to  indicate  that  it 
should  be  amplified  in  a  revision. 

CI615B-CI616B  was  the  first  experimental  revision,  in  which  part 
II  was  expanded  to  60  items.  Part  I  remained  unchanged. 

CI616C*  is  the  final  modification  which  replaced  CI615B-CI616B  in 
the  classification  battery  in  September  1944.  It  consists  of  the  60  items 
of  part  II,  with  a  new  set  of  directions.  Part  I  is  entirely  eliminated  ex¬ 
cept  for  portions  of  its  directions  explaining  the  functions  of  the  dials. 

Evaluation. — Instrument  Comprehension  played  a  very  interesting  and 
significant  role  among  the  tests  used  in  the  psychological  program.  It 
was  admitted  to  the  classification  battery  in  November  1943,  after  more 
than a year of development. Early data indicated that part I had a negative beta weight. Later data resulted in a zero or slightly positive weight. Thus part I was a candidate for withdrawal, as soon as it could be established that performance on part II would not be adversely affected by the fact that it was not preceded by part I. This change was accomplished in September 1944, when part II was given a new set of directions and was continued in the battery as CI616C.

* Developed at Psychological Research Unit No. 3. Chief contributor: Lt. David H. Jenkins.

Table 19.7. — Validity of items of Instrument Comprehension based upon graduation-elimination of pilots in primary training (in class 44H; N = 704; p = .85)

Its  contribution  was  at  least  two-fold.  First,  it  was  a  substantial  mem¬ 
ber  in  the  family  of  valid  group  tests.  Second,  in  factor  analysis  studies, 
it  helped  to  shed  light  on  the  question  of  why  certain  other  tests  were 
valid,  thus  illuminating  the  way  for  further  test  construction. 

That  Instrument  Comprehension’s  validity  was  due  in  great  part  to  its 
measurement  of  the  spatial-relations  factor  was  the  subsequent  revela¬ 
tion  of  factor  analysis.  To  realize  what  a  gain  in  insight  this  was,  we 
have  only  to  refer  back  to  the  original  rationale  for  the  test  and  note  the 
six  different  abilities  it  was  supposed  to  measure.  As  things  turned  out, 
it  measures  only  one  ability  related  to  the  six  (visualization)  and,  in 
addition,  it  measures  several  other  quite  differently  defined  abilities.  This 
gain  in  understanding  pointed  the  way  to  a  new  line  of  research  and  test 
construction,  aimed  at  developing  printed  tests  of  spatial  ability.  Results 
show  that  no  other  new  or  unique  factor  was  needed  to  account  for  the 
validities  of  approximately  0.20  for  part  I  and  0.30  for  part  II  (see 
table  28.18). 

Aerial  Orientation,  CP520A  • 

Instrument  Comprehension  sampled  the  ability  to  interpret  position 
from  instrument  readings,  but  no  measure  had  been  constructed  to  test 
the  ability  to  interpret  position  from  outside  cues  visible  from  the  pilot’s 
position  in  the  cockpit.  It  was  hypothesized  that  the  latter  ability  is  more 
important to the pilot, both in training and in combat flying. Aerial Orientation was designed to measure reaction to external cues seen from within the cockpit. It was intended, since the pilot must perceive plane attitude from spatial cues rather than from intellectual interpretation, that Aerial Orientation should contain more spatial and less verbal and reasoning factor content than Instrument
Comprehension. 

Description. — For  each  item,  a  cockpit  view  is  shown  diagrammatically 
on  the  left-hand  side  of  the  page,  set  opposite  photographs  of  a  model 
airplane  in  five  different  attitudes.  The  cockpit  views  are  drawings  rep¬ 
resenting  patterns  of  land  and  water  that  might  be  seen  from  an  airplane 
maneuvering over a point on a coastline. The examinee’s problem is to match the cockpit view with the airplane from which that view would be seen.

• Developed at Psychological Research Unit No. 3. Chief contributor: S/Sgt. Wayne S. Zimmerman.

(1) Internal characteristics. — The directions contain two recorded but unscored sample items. Part I contains 30 scored items, and part II, 28.

(2) Administration. — Six minutes are allowed for administration of
the  directions,  10  minutes  for  doing  items  in  part  I,  and  8  minutes  for 
doing  items  in  part  II,  making  a  total  testing  time  of  24  minutes. 

A  sample  item  is  shown  in  figure  19.3.  Following  are  part  of  the 
directions : 

This  is  a  test  of  your  ability  to  visualize  the  relationship  between  a  plane  and  the 
territory  over  which  it  flies. 

Look  at  the  sample  problem.  The  picture  on  the  left  represents  the  landscape  as  it 
would appear to the pilot of one of the five planes on the right. Your task is to
select  the  plane  from  which  the  pilot  would  see  this  view  when  he  looks  directly 
over  the  nose  of  his  plane.  Notice  that  in  the  group  of  five  pictures  the  ocean  is  on 
the  right  and  land  is  on  the  left,  with  the  coast  line  directly  under  the  plane. 

Study  the  sample  problem  and  select  the  plane  from  which  the  pilot  would  see  the 
landscape  as  it  appears  in  the  left  picture. 

According  to  the  picture  on  the  left,  the  correct  plane  is  flying  level,  unbanked, 
and headed directly over the land. Thus D is the correct answer because from this plane the pilot would see only land, and the horizon would appear to be level.

(3) Scoring. — The scoring formula is R - W/4.

Statistical results. — The data given below are all based upon examinees tested at Psychological Research Unit No. 3.

(1) Distribution statistics. — Typical distribution statistics are given in table 19.8. The distribution curves are symmetrical and considerably flatter than normal.

Table 19.8. — Distribution constants for Aerial Orientation, CP520A, based upon samples of classified pilots

(2) Internal consistency. — The degree of homogeneity of the test is indicated by a mean internal-consistency phi of 0.42, a standard deviation of the phi distribution of 0.12, and a range of values from 0.14 to 0.60. These statistics are based upon analysis of the responses of the highest 27 percent and the lowest 27 percent in total score of a group of 750 unclassified aviation students tested in July 1944.

(3)  Reliability  coefficient. — By  correlating  part  I  with  part  II,  an 
estimated  reliability  coefficient  of  0.84,  corrected  for  length,  was  ob¬ 
tained.  This  figure  is  based  on  a  sample  of  443  classified  pilots  in  class 
44H. 
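A correction for length of this kind is conventionally made with the Spearman-Brown formula; that this was the formula used here is an assumption, since the text does not name it. The sketch below applies the formula and its inverse. The function names are hypothetical, and because parts I and II differ slightly in length (30 and 28 items), treating them as equal halves is an approximation.

```python
def spearman_brown(r_part, length_factor=2.0):
    """Reliability of a test lengthened by length_factor, given part reliability."""
    return length_factor * r_part / (1.0 + (length_factor - 1.0) * r_part)


def part_correlation_implied(r_full, length_factor=2.0):
    """Inverse of Spearman-Brown: the part correlation implied by a corrected value."""
    return r_full / (length_factor - (length_factor - 1.0) * r_full)


# A half-test correlation of .88 steps up to about .94.
print(round(spearman_brown(0.88), 2))
# A corrected coefficient of 0.84 implies a part I with part II correlation near 0.72.
print(round(part_correlation_implied(0.84), 2))
```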
(4) Correlation between rights and wrongs. — For a sample of 330 pilots tested in the period from July 4 to July 26, 1944, the correlation between rights and wrongs was -0.77. For another independent sample of 500 unclassified aviation students (tested early in 1945) the correlation was -0.75.

(5) Difficulty. — Based upon the responses of the above-mentioned sample of 750 unclassified aviation students, the test yielded a mean proportion of correct responses of 0.63, corrected for chance, a standard deviation of 0.18, and a range from 0.14 to 0.98.

(6) Test validity. — Validation results based on several samples are given in table 19.9.

(7) Item validity. — Validation of items revealed a mean phi of 0.08, based upon the responses of 600 graduates and 104 eliminees from primary pilot training in class 44H. The standard deviation of phi values was 0.06, and the range was from -0.12 to 0.22.

Variations. — Form CP520B 7 is a lengthened version of the original test embodying two significant changes. First, the airplane model used in the photographs is more realistic in appearance, and its various attitudes are more easily perceived. Second, for the purpose of simplifying the problems and increasing the speed factor, only 45° and 90° turns, banks, and altitude changes (climbs and dives) are presented. The B form was fully developed but was not printed in booklet form, because it was developed very late in the program, when experimental testing ceased.

Form CP520C is made up of the items from part I of Form A and is divided into two equal parts.

Evaluation.—Aerial Orientation, CP520A, proved to be one of the most
valid printed tests constructed for pilot as well as navigator
selection. It would have added slightly to the total validity of the
classification battery, even with Instrument Comprehension (the test it
most closely resembles) as a member. This finding indicates that it
either measures known valid factors better than other tests that are in
the battery, or that it possesses a unique valid factor or factors, or
both. Intercorrelations indicate that, as planned, Aerial Orientation
is probably a purer measure of a spatial factor than is Instrument
Comprehension. The indications are that Aerial Orientation would do
very well as a substitute for Instrument Comprehension in any alternate
battery. Its extremely high navigator validity, however, indicates
factorial complexity and somewhat dims its promise as a discriminating
classification test.

Flight  Orientation,  CP528A  * 

The idea for Flight Orientation was proposed at the time Aerial
Orientation was being developed. It was hypothesized (1) that the
ability visually to maneuver an airplane as if from a position outside
the cockpit is a manipulatory-visualization ability and (2) that the
ability to imagine maneuvers taking place as if the examinee were
within the cockpit is a spatial-orientation ability.

' Developed at Psychological Research Unit No. 3. Chief contributor:
Staff/Sgt. Wayne S. Zimmerman.

* Developed at Psychological Research Unit No. 3. Chief contributors:
Pvt. Charles K. Terlusen and Staff/Sgt. Wayne S. Zimmerman.

The Aerial Orientation test utilized cockpit views of outside terrain to
be matched with depicted plane attitudes; the visualization-of-maneuvers
tests involved only views of airplanes seen from a position outside of
the cockpit (see ch. 12 for a discussion of these tests). Flight
Orientation was designed to fulfill the requirements of the indicated
variation: a test that would utilize only cockpit views of outside
terrain. From the hypotheses given above, it follows that Aerial
Orientation should measure a combination of manipulatory-visualization
and spatial-orientation abilities, while Flight Orientation should be a
purer measure of the ability to orient in space.

Description.—Each item consists of two landscape photographs taken
from the cockpit of an airplane. The first photograph represents the
view seen by the pilot when the plane is in one given position. The
second shows the view seen after a certain maneuver has been completed.
The examinee must decide which of a number of maneuvers has taken
place. The maneuvers include right turns, left turns, right rolls, left
rolls, and climbs and dives. Answers are marked on a special,
overprinted answer sheet to be described later.

(1) Internal characteristics.—Part I contains 3 recorded, but unscored,
sample items and 47 scored items. Part II contains 50 scored items. In
Part I, only one maneuver is represented as having been completed
between any two photographs. In Part II, two combined maneuvers, such
as a turn and a dive, are represented. In this part, two descriptive
terms are required for each item in order to describe the action.

(2) Administration.—Eight minutes are allowed for Part I and 14
minutes for Part II. The directions require approximately 5 minutes,
making a total testing time of 27 minutes.

A sample item for Part I is shown in figure 19.4. Following are part
of the directions.

This is a test of your ability to recognize change in flight position.

Look at the sample problem. Two pictures are shown. The picture at the left
shows a cockpit view. The picture at the right shows the cockpit view as it appears
after a single maneuver. Your task is to determine the maneuver. The maneuver will
be one of the following: Left or right turn; left or right roll; climb, up or down.

The correct answer to sample problem 1 is "right roll."

The answer to item 1 is marked correctly on the illustration of the answer sheet
(shown in fig. 19.5).

(3) Scoring.—The scoring formula used is R - W/3.
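The scoring formulas quoted in this chapter all have the same shape: rights, minus a fraction of wrongs, sometimes plus a constant that keeps scores positive. A minimal sketch follows; the function name, its parameters, and the examinee counts are hypothetical and are not taken from the report.

```python
def formula_score(rights, wrongs, divisor=1, constant=0):
    """Return R - W/divisor + constant.

    divisor=3 gives R - W/3, divisor=5 gives R - W/5, and
    divisor=1 with constant=100 gives R - W + 100."""
    return rights - wrongs / divisor + constant

# Hypothetical examinee: 60 rights and 12 wrongs under R - W/3.
print(formula_score(60, 12, divisor=3))      # 56.0

# Hypothetical examinee: 90 rights and 6 wrongs under R - W + 100.
print(formula_score(90, 6, constant=100))    # 184.0
```

A larger divisor penalizes each wrong answer less, so the choice of divisor sets how heavily guessing is discouraged.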

Statistical results.—The analysis of this test is only partially
complete. The data all are based upon examinees tested at Psychological
Research Unit No. 3.

FIGURE 19.5

SECTION OF ANSWER SHEET OF FLIGHT
ORIENTATION, CP528A

(1)  Internal  consistency. — Analysis  of  the  responses  of  one  sample 
group  yielded  the  internal-consistency  data  given  in  table  19.10. 


Table 19.10.—Internal-consistency data for Flight Orientation, CP528A, based
upon a sample of unclassified aviation students ¹

Part      Mφ       SDφ      Range of φ
                            Low       High
I         0.30     0.14     0.04      0.6:
II         .37      .17     -.10       .74

¹ N=750 tested from September 26 to Nov. 22, 1944. The criterion is total
rights score, parts I and II combined.


(2)  Reliability  coefficient.— Based  on  a  sample  of  502  unclassified 
aviation  students  tested  from  August  10  to  October  6,  1944,  a  correla¬ 
tion  of  0.78  (corrected  for  length)  was  secured  between  part  I  and  part 
II  of  this  test.  Since  the  two  parts  are  not  completely  comparable,  this 
figure  can  be  considered  only  as  a  very  rough  estimate  of  the  reliability. 

(3)  Difficulty.— Based  upon  the  responses  of  the  above-mentioned 
sample  of  750  unclassified  aviation  students,  Parts  I  and  II  of  the  test 
yielded  the  mean  proportions  of  correct  responses  given  in  table  19.11. 


Table 19.11.—Difficulty of items for Flight Orientation, CP528A

Part      M        SD       Range of difficulty
                            Low       High
I         0.61     0.17     0.16      O.M
II         .34      .19      .10       .91

FIGURE 19.7

SECTION OF ANSWER SHEET OF STICK
AND RUDDER ORIENTATION, CPS.fl A

to a maneuver or a completed maneuver. For example, problem 1 shows an airplane
banking to the right. Your task is to record on the special answer sheet the positions
to which the stick and rudder would be moved to perform each maneuver.

For the problems in this test the stick and rudder are moved as follows:

For banks only: To bank left, move stick to left. To bank right, move stick to
right.

For banked turns: To turn left, move both stick and rudder to left. To turn right,
move both stick and rudder to the right.

For climbs and dives: To dive, push stick forward. To climb, pull stick back.

On your answer sheet indicate stick and rudder positions according to the diagram
of the answer sheet and accompanying key. (See fig. 19.6.)

Reading downward, the series of three cockpit views would be seen from an
airplane banking to the right. To maneuver the airplane in this direction the stick
would be moved to the right. Therefore, right stick is correctly marked on the sample
answer sheet.

(3) Scoring.—The scoring formula used is R - W/5.

Statistical results.—None are available.

Discrimination Reaction Time (Paper), CP634A 11

The economy provided by printed tests administered to large groups
is so great, compared with the more difficult and troublesome
administrative requirements of apparatus tests, that there was
spasmodic interest in the possibility of duplicating valid apparatus
tests in the simpler printed form.

11 Developed at Psychological Research Unit No. 3. Chief contributor:
Staff/Sgt. Wayne S.

Early in the program, little success was realized in duplicating the
functions of apparatus tests in the form of printed tests. This was
partially due to the fact that the factorial content of apparatus tests
used in the classification batteries was not well known. It had been
assumed that the validity of Complex Coordination, CM701A, Two-Hand
Coordination, CM101A, Discrimination Reaction Time, CP611D, and other
psychomotor tests included in the classification battery was due
primarily to the motor-coordination elements measured. Consequently,
attempts were made to maintain the conspicuous motor aspects of tests
when printed material was employed. This naturally presented obstacles
that were not easy to overcome. But after it was found that several
apparatus tests were highly correlated with some printed tests, it was
discovered that they actually measured spatial-relations,
perceptual-speed, and visualization abilities as well as motor
abilities. The result was encouraging, and the way was clear to attempt
the desired duplications by means of printed tests.

Description.—Discrimination Reaction Time, CP634A, presents, on
paper, patterns similar to those shown with red and green lights in the
apparatus Discrimination Reaction Time test.12 Black and white circles
take the place of the colored lights. The examinee responds to the
stimuli by marking in one of the four directions (up, down, left, or
right) on a specially designed answer sheet. Each of the four
directions for marking corresponds to the direction that the examinee
would move in order to snap one of the four switches on the
discrimination reaction time apparatus.

Parts I and II present patterns almost identical with the light patterns
in the apparatus test. Parts III and IV call for a response to slightly
more complex patterns. The stimulus is the arrangement of three circles,
one white, one black, and one with a cross. The response is made in one
of the same four directions: up, down, left, or right.

(1) Internal characteristics.—The directions for parts I and II
include four unrecorded sample items and five recorded but unscored
sample items. Part I contains 45 scored items, and part II contains 50
scored items. The directions for parts III and IV contain five
recorded, but unscored, sample items. Part III contains 45 scored
items, and part IV contains 50 scored items. All of the items in each
part are included on a single page. Each item is located on the page in
the position that corresponds to the position of its item number on the
answer sheet. The items in parts III and IV, calling for a response to
a three-circle pattern instead of a two-circle pattern, are more
difficult than those in parts I and II.

(2) Administration.—Answers are marked directly on the special,
overprinted answer sheet described later. One minute and 15 seconds
are allowed for part I, 1 minute for part II, 1 minute and 30 seconds
for part III, and 1 minute and 15 seconds for part IV. Administration
of directions and sample items takes 5 minutes, making a total testing
time of approximately 10 minutes.

12 For a brief description of this test see *04. For a description see
Report No. 4.

Examinees are instructed to make only a small check, short line, or
other quick mark to indicate their answers. They are told that time will
be allowed later to fill in the spaces completely. At the end of parts I
and II they are instructed to close their test booklets and to go back
and fill in the answer spaces for both completed parts. At the end of
the test the same instructions are again given for parts III and IV.
This feature was later discarded.

Correctly marked sample items for parts I and II are shown in figure
19.8. Following is part of the directions for parts I and II:

FIGURE 19.8

SAMPLE PROBLEMS OF DISCRIMINATION
REACTION TIME (PAPER), CP634A,
PARTS I & II, AND CORRECTLY MARKED
ANSWER SHEETS

This is a test of speed of reaction to a signal. The signal will be an arrangement
of a black and a white circle. There are only four arrangements of the circles, and
four ways to mark your answer sheet. Look at the sample problem below and the
corresponding illustrations of the correct ways to mark your answer sheet.

A. When the white circle is below the black circle, mark the lower space in
the cross.

B. When the white circle is above the black circle, mark the upper space in the
cross.

C. When the white circle is to the left of the black circle, mark the space to
the left.

D. When the white circle is to the right of the black circle, mark the space to
the right.

Following is a part of the directions for parts III and IV:

In this part of the test the signal will be the arrangement of three circles instead
of two. (See fig. 19.9.)

FIGURE 19.9

SAMPLE PROBLEMS OF DISCRIMINATION
REACTION TIME (PAPER), CP634A,
PARTS III & IV

1. If the black circle is on the outside, mark in its direction.

2. If the black circle is in the center, mark in the direction of the white circle.

Remember:

Black outside, mark toward black.

Black center, mark toward white.

Work sample items 1 through 5.

The correct answers should be marked as follows:

Item 1, upper space.

Item 2, right space.

Item 3, lower space.

Item 4, right space.

Item 5, lower space.
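The two sets of marking rules quoted above amount to small decision rules. The sketch below encodes them under stated assumptions: the figures are not reproduced here, so the representation of a pattern (which circle is in the center, and the direction of each outer circle from it) is hypothetical, as are the function and variable names.

```python
DIRECTIONS = {"up", "down", "left", "right"}

def two_circle_response(white_from_black):
    """Parts I and II: mark toward the side on which the white circle
    lies relative to the black circle."""
    if white_from_black not in DIRECTIONS:
        raise ValueError("unknown direction: %r" % (white_from_black,))
    return white_from_black

def three_circle_response(center, outer):
    """Parts III and IV.

    center: kind of circle in the middle ("white", "black", or "cross").
    outer:  mapping from each outer circle's kind to its direction
            from the center circle (an assumed encoding).
    """
    if center == "black":
        return outer["white"]   # black center, mark toward white
    return outer["black"]       # black outside, mark toward black

# Hypothetical patterns, not the report's sample items.
print(two_circle_response("down"))
print(three_circle_response("black", {"white": "up", "cross": "left"}))
print(three_circle_response("cross", {"black": "right", "white": "down"}))
```

The three-circle rule requires the examinee to check the position of the black circle before choosing which circle to follow, which is consistent with the report's remark that parts III and IV are the more difficult.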

(3) Scoring.—The scoring formula used is R - W/3.

Statistical results.—None are available.

Directional Marking, CP533A 13

This test was designed to make a decisive examination of the hypothesis
that the space I factor is a directional-discrimination ability. There
was an attempt, in construction, to include movements in three
dimensions. On a flat sheet of paper, such as an answer sheet, however,
only two-dimensional movement is possible. Up and down and left and
right were selected as the descriptive terms in these two dimensions.
Since the third dimension of depth could not be used, the closest
substitute suggested was to use the terms near and far, which are
ordinarily descriptive of depth, but which were made to apply to the
two-dimensional surface as limitations on the movements in the
flat-surface dimensions.

Description. — Each  item  consists  of  four,  verbally  stated  directions  of 
movement,  and  each  of  these  four  statements  describes  the  position  of 
an  answer  space  that  is  within  a  square,  printed  on  the  answer  sheet, 
containing  25  answer  spaces.  The  center  space  in  each  square  or  box  is 
covered  by  a  solid  black  circular  spot,  which  is  designated  as  the  starting 
point  for  each  of  the  responses  to  items.  The  examinee's  task  is  to  place 
four  marks  in  each  box  at  the  distances  and  directions  from  the  center 
circle  that  are  described  in  the  four  verbally  stated  problems.  Figure 
19.10  shows  one  correctly  marked  sample  item  of  the  test. 


FIGURE 19.10

SAMPLE ITEMS WITH CORRECTLY
MARKED ANSWERS FOR DIRECTIONAL
MARKING, CP533A

(1) Internal characteristics.—All of the answers are marked on one
side of an answer sheet. Since there are 25 answer spaces inclosed
within each item box and 750 answer spaces on one side of an answer
sheet, there are 30 item boxes. One of these is used for the sample
problem, 14 are used for part I, and the remaining 15 are used for part
II. Thus, the directions contain 4 recorded but unscored answers, part
I contains 56 recorded and scored answers, and part II contains 60
scored answers.

13 Developed at Psychological Research Unit No. 3. Tech./Sgt. C

(2) Administration.—Three minutes are allowed for recording answers
in part I, and 3 minutes and 15 seconds are allowed in part II.
Administration of directions and explanation and recording of the
sample problem takes 5 minutes, making a total testing time of 11
minutes.

Following is part of the directions:

Each item in this test consists of four problems; each problem requires you to
place a mark in a box * * * The box appears on your separate answer sheet
and has a dot in the center surrounded by spaces for the marks.

Your task is to place the four marks at the distances and directions from the dot
which are indicated by the following instructions:

1. Near right means one space from the dot to the right; near left, one space
to the left.

2. Far right means two spaces from the dot to the right; far left, two spaces
to the left.

3. Near up means one space up from the dot; near down, one space down.

4. Far up means two spaces up from the dot; far down, two spaces down.
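The four definitions quoted above can be read as offsets from the center dot, and the answer-space counts given under internal characteristics follow from simple arithmetic. The sketch below illustrates both. The coordinate convention (x to the right, y upward) and all names are assumptions made for illustration; the report describes the layout only in words.

```python
STEP = {"near": 1, "far": 2}
UNIT = {"right": (1, 0), "left": (-1, 0), "up": (0, 1), "down": (0, -1)}

def answer_space(instruction):
    """Map a phrase such as 'far left' to an (x, y) offset from the dot."""
    distance, direction = instruction.lower().split()
    dx, dy = UNIT[direction]
    n = STEP[distance]
    return (dx * n, dy * n)

# Answer-space arithmetic from the description of the answer sheet.
SPACES_PER_BOX = 25      # stated in the text
SPACES_PER_SIDE = 750    # stated in the text
boxes = SPACES_PER_SIDE // SPACES_PER_BOX          # 30 item boxes
sample_boxes, part1_boxes, part2_boxes = 1, 14, 15
answers = [4 * n for n in (sample_boxes, part1_boxes, part2_boxes)]

print(answer_space("near right"), answer_space("far down"))
print(boxes, answers)
```

Two distances in four directions give only eight distinct instructions, so the 24 other marking spaces in a box are never the correct answer to a single instruction.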

(3) Scoring.—The scoring formula used is R - W/4.

Statistical results.—None are available.

Evaluation of Directional Discrimination Tests

The question of whether there is a single factor in common to all tests
in this group is still unanswered, although evidence suggesting such a
conclusion is considerable. The study of tests in this area has been
particularly challenging, (1) because the spatial factor is highly
valid for both pilot and navigator selection, and (2) because it has
been difficult to develop a pure measure of the factor. No test
constructed as yet has demonstrated a loading substantially greater
than 0.50 for the spatial-relations factor. The validity of this factor
for all three air-crew assignments justifies maximal effort to enlarge
this loading and to purify its measuring instrument.

If the hypothesis that a directional-discrimination ability is the
important aspect of the spatial-relations factor is correct, then we
should expect Directional Marking or Discrimination Reaction Time
(paper) to show the greatest purity. Directional Marking is crucial for
the hypothesis because it removes the element of visually perceived
spatial arrangements from consideration by presenting the stimuli
verbally.

POSITIONAL  DISCRIMINATION  TESTS 

The latest factor study referred to in the introduction to this chapter
proved to be important not only because it demonstrated the spatial
nature of the factor that had been in question, but also because it
pointed to the existence of a second spatial factor. That this second
factor had not emerged until this particular analysis is attributable
to the fact that the Hands test had not been incorporated in any of the
previously studied batteries. Until its inclusion, presumably, other
tests that appreciably measured the ability had been too few or too
weak in the ability to make its presence known. Other tests having this
factor in common are Flags, Figures, and Cards, CP512A, and Cubes,
CP5i2A.

Several tentative names were proposed for this variable. These
included: "Hands" space, rotational space, spatial empathy, and
positional discrimination. The term "spatial empathy" was suggested
because informal introspective reports of some psychologists indicated
that in solving the items one may "project" himself into the test
objects or into a more favorable posture from which to judge the
positions of the objects.

The following tests are designated "Positional Discrimination" tests
and are treated together as the second subarea of tests of a spatial
chapter. The three tests described are revisions of those mentioned
above and are named Positional Orientation, CP526A, Object
Identification, CP521A, and Object Recognition, CP523A. They are
discussed in the order of their development.

Object Identification, CP521A 14

This is a revision of Thurstone's Flags test. It was adapted for the
purpose of measuring and studying the hypothesized ability to
manipulate images in space. This revision was further motivated by the
promising validity of Thurstone's Flags test, which had been adapted by
the AAF in a systematic study of perceptual tests. Face validity was
incorporated into one section of the revised form of the test by using,
instead of flags, silhouettes of military vehicles.

Description. — Silhouettes  of  planes,  trucks,  guns,  tanks,  and  ships  are 
presented  in  part  I,  and  flags  are  used  in  part  II.  The  examinee's  task 
is  to  select,  from  five  illustrations,  those  that  show  the  same  side  of  an 
object  as  that  shown  in  a  key  illustration.  Some  are  turned  over  and  are, 
therefore,  incorrect  answers. 

(1)  Internal  characteristics. — The  directions  contain  two  recorded  but 
unscored  sample  items.  Part  I  contains  28,  and  part  II  30  scored  items 
presented  in  line  drawings. 

(2) Administration.—Directions consume 3 minutes, and 7 and 6 minutes
are allowed to complete parts I and II respectively, making a total
testing time of 16 minutes.

One sample item from part I of the test is shown in figure 19.11.
Following are part of the directions:

FIGURE 19.11

SAMPLE ITEM OF OBJECT IDENTIFICATION,
CP521A

This is a test of your ability to identify objects in silhouette.

You will see rows of six silhouettes. Your task will be to compare the five
silhouettes labeled A, B, C, D, and E with the first silhouette in each row.

Some of the five silhouettes are the same as the first one in the row, but have
been slid around into different positions.

Others of the five silhouettes are different from the first one in the row; that is,
they have been turned over and could not be made to fit the first silhouette by
simply sliding them around on the page.

Your problem will be to decide which of the five silhouettes in each row match
the silhouette on the left.

Plane silhouettes A, B, and C are the same as silhouette number 1. Therefore A,
B, and C are the correct answers for item number 1.

14 Developed at Psychological Research Unit No. 3. Chief contributor:
L*** G. WfijbC

(3) Scoring.—The scoring formula used is R - W + 100.

Statistical results.—Unless specifically noted to the contrary, the
data below are for examinees tested at Psychological Research Unit
No. 3.

(1) Distribution statistics.—Typical examples of distribution
statistics are given in table 19.12. The distribution curves are
considerably negatively skewed and somewhat flatter than normal.


Table 19.12.—Distribution constants for Object Identification, CP521A, based upon
samples of classified pilots

N           M         SD
¹282        184.7     21.4
²1,222      186.7     24.0

¹ In classes 44G and 44U.
² In class 44H.


(2)  Reliability  coefficient. — Three  samples  yielded  the  estimates  of 
reliability  given  in  table  19.13.  Table  3.1  presents  additional  reliability 
data. 


Table 19.13.—Alternate-forms (part I v. part II) reliability coefficients for
Object Identification, CP521A

Group                                   N       r(I, II)    Corrected r
Pilots¹                                 348     ²0.66       0.80
Unclassified aviation students³                 ².65         .79
Unclassified aviation students³         500     ⁴.60         .75
Unclassified aviation students plus
  airplane mechanics³                           ⁴.64         .78

¹ In class 44H.
² Part II administered immediately after part I.
³ Tested at Medical and Psychological Examining Units Nos. 6 and 8 early in 1945.
⁴ Part II administered approximately 4 hours after part I.
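The paired coefficients in Table 19.13 are consistent with the Spearman-Brown formula for a test of doubled length, which is the usual meaning of "corrected for length" when part I is correlated with part II. The sketch below checks that reading against the four pairs as read from the table. The function name is hypothetical, and the report does not name the formula in this section, so this is an inference rather than a documented procedure.

```python
def spearman_brown(r_parts, factor=2.0):
    """Estimated reliability of a test `factor` times as long as the
    parts whose intercorrelation is r_parts."""
    return factor * r_parts / (1 + (factor - 1) * r_parts)

# (part I v. part II correlation, value reported after correction)
pairs = [(0.66, 0.80), (0.65, 0.79), (0.60, 0.75), (0.64, 0.78)]

for r, reported in pairs:
    estimate = spearman_brown(r)
    print(r, round(estimate, 2), reported)
```

Each computed value rounds to the reported one, which supports treating the second column as the length-corrected estimate.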


(3) Correlation between rights and wrongs.—The data are shown in
table 19.14.

Table 19.14.—Correlation between rights and wrongs for Object Identification,
CP521A

Group                               N         Part         r(rw)
Unclassified aviation students¹     *500      I            -0.28
  Do¹                               500       II            -.22
Pilots²                             i.so      I and II      -.40
                                    1,257     I             -.28
  Do                                1,257     II            -.28
  Do³                               1,257     I and II      -.3U

¹ Tested early in 1945. Exact testing dates not identified.
² Tested in July 1944.
³ Tested at Psychological Research Unit No. 1 in June 1944; at Psychological
Research Unit No. 2 in April 1944; and at Psychological Research Unit No. 3 in
March 1944.

(4) Factorial composition.—The most significant loadings for
Thurstone's test, Flags, Figures, and Cards, CP512A, the earlier
version of Object Identification, CP521A, are in the spatial-relations
(0.43), space II (0.42), and perceptual-speed (0.31) factors. The
communality is 0.54, which almost equals the test reliability. For a
fuller picture of the factorial composition of this test see Appendix B.

(5)  Test  validity. — Validation  results  based  on  several  samples  are 
given  in  table  19.15. 

(6) Item validity.—Validation of items revealed a mean phi of 0.08,
based upon the responses of 000 graduates and 93 eliminees from primary
pilot training in class 44I. The standard deviation of phi values
was 0.08, and the range was from -0.12 to 0.23.

Evaluation.—Flags, Figures, and Cards, the test from which Object
Identification was derived, was factor analyzed, but the derived form
was not. Forty-six percent of the total variance of Flags is
attributable to three factors. Eighteen percent of the total variance
is found in the spatial-relations factor, 18 percent in the space II,
and 10 percent in the perceptual-speed factor. The known factors in the
Flags test exactly account for its average pilot validity of 0.24,
which allows a small loading of 0.05 for the factor space II in the
pilot criterion (see table 28.17).
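The variance percentages in this paragraph follow from the factor loadings quoted under factorial composition, since in an orthogonal factor solution the share of test variance attributable to a factor is the square of its loading. The sketch below reproduces the arithmetic; the dictionary keys are simply labels for the three loadings given in the text.

```python
# Loadings quoted for Flags, Figures, and Cards, CP512A.
loadings = {
    "spatial relations": 0.43,
    "space II": 0.42,
    "perceptual speed": 0.31,
}

# Proportion of total test variance carried by each factor.
shares = {name: value ** 2 for name, value in loadings.items()}

percent = {name: round(100 * share) for name, share in shares.items()}
total_percent = round(100 * sum(shares.values()))

print(percent)
print(total_percent)
```

The three squared loadings sum to about 0.46, short of the stated communality of 0.54, so the remaining common variance lies in the smaller loadings not listed here.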

Part  II  of  Object  Identification  showed  somewhat  higher  pilot  validity 
than  part  I  in  one  sample  but  not  in  another.  It  would  be  of  interest  to 
treat  parts  I  and  II  as  two  separate  variables  when  factor  analyzing 
them. 

The  substantial  navigator  validity  calls  for  investigation  to  determine 
its  source. 

Object Recognition, CP523A 15

This is a revision of Thurstone's Cubes test. It was adapted for the
purpose of obtaining another measure of the hypothesized ability to
manipulate images in space. Solving problems in this test seemed to call
for visualization in three dimensions from presentation in only two
dimensions. In Thurstone's factor analyses, Cubes appeared heavily
saturated with his space factor (factor S), a factor that, if we are to
accept Thurstone's description, seems more closely to resemble
visualization. A reanalysis of Thurstone's data showed a rather clear
separation of space and visualization.

For the purpose of adding face validity to the items, insignia of the
United States Army were substituted for the nonsense symbols of
Thurstone's cubes.

Description.—An item consists of an illustration of a single cube (key
cube) in a row with five other (alternate) cubes. On each of the six
sides of a cube are different military insignia, but only three sides
show in the illustrations. The examinee's task is to determine whether
or not each of the five alternate cubes could represent the key cube;
that is, whether every side bears the same insignia in the proper
relationships. The alternate cubes are either rotated or turned from
the original position, have different insignia, or have the insignia
placed in different relationships than on the key cube.

15 Developed at Psychological Research Unit No. 3. Chief contributors:
Cpl. Albert A. Canfield Jr. and Staff/Sgt. Benjamin Fruchter.

(1)  Internal  characteristics. — The  directions  contain  one  sample  row 
of  cubes,  and  parts  I  and  II  each  contain  10  key  cubes.  The  sample  re¬ 
quires  5  judgments  and  each  part  requires  50  judgments,  5  for  each 
key  cube. 

(2) Administration.—Two minutes are allowed for administration of
the directions and sample items, 13 minutes for part I, and 12 minutes
for part II, making a total testing period of 27 minutes.

A  sample  is  shown  in  figure  19.12.  Following  are  pait  of  the  direc¬ 
tions  : 


s  a  a  c  o  c 


FIGURE  19.12 

sample  item  of  object  recognition, 

CP523A 

This  is  a  test  of  your  ability  to  visualize  change  of  position.  You  will  be  shown 
drawings  of  cubes.  Each  cube  has  six  sides,  and  each  side  has  a  different  military 
insignia.  Look  at  the  sample  problem. 

The cube at the far left is the key cube. Your task is to select from the five cubes
at the right the ones that could represent the key cube turned to a different position.

Cubes C and D are correct answers. Both could be the key cube turned to a different
position.

(3) Scoring. — The a priori scoring formula is R-W. Most of the
statistics reported on the test, however, are for rights and wrongs scored
separately.
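
As a rough illustration of the R-W formula, the sketch below scores the five judgments attached to each key cube. The data layout (one set of marked alternates and one set of keyed alternates per key cube) and the function name are assumptions made for the example, not part of the test materials. It also assumes that a right is a marked alternate that is keyed correct and a wrong is a marked alternate that is not; unmarked alternates count as neither.

```python
def score_rights_wrongs(marked, keyed):
    """Return (rights, wrongs, rights - wrongs) for one examinee.

    marked and keyed are lists holding one set per key cube: the alternates
    the examinee marked, and the alternates keyed as correct (assumed layout).
    """
    rights = sum(len(m & k) for m, k in zip(marked, keyed))  # marked and keyed
    wrongs = sum(len(m - k) for m, k in zip(marked, keyed))  # marked, not keyed
    return rights, wrongs, rights - wrongs


# Sample item from the directions: cubes C and D are the correct answers.
keyed = [{"C", "D"}]
marked = [{"C", "E"}]  # hypothetical examinee: one right, one wrong
print(score_rights_wrongs(marked, keyed))  # (1, 1, 0)
```

Keeping rights and wrongs as separate counts, as most of the reported statistics do, lets either score be analyzed alone before the difference is formed.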

Statistical results. — The available data all are based upon examinees
tested at Psychological Research Unit No. 3.

(1) Distribution statistics. — Typical examples of distribution statistics
are given in table 19.16. The distribution curves for rights scores
are moderately negatively skewed and considerably flatter than normal,
and for wrong scores are positively skewed and somewhat flatter than
normal.

Table 19.16. — Distribution constants for Object Recognition, CP523A

Group       Part        Scoring     N         M       SD
[illeg.]    I and II    [illeg.]    ¹1,130    20.9    7.[illeg.]
Do          [illeg.]    [illeg.]    ¹1,130    11.5    5.4
[illeg.]    I           [illeg.]    500       16.4    4.3
[illeg.]    II          Rights      ²500      11.9    4.3
[illeg.]    [illeg.]    Rights      ²500      29.2    7.9
Do          [illeg.]    Wrongs      ²500      11.1    5.5

¹ In class 44I.
² Testing dates unidentified.

(2) Internal consistency. — The degree of homogeneity of the items
is indicated by a mean internal-consistency phi (based upon the total
group) of 0.52, a standard deviation of the phi distribution of 0.14, and
a range of values from 0.05 to 0.84. These statistics are based upon the
responses of the highest 27 percent and the lowest 27 percent in total
score of a group of 750 unclassified aviation students tested in May 1944.
Contrary to usual practice, it will be noted, statistics are based on the
"total group." The reason for this is that the total number of attempts
for any one cube cannot be determined, since the examinee responds to
each cube by marking or not marking each appropriate answer-space.

(3)  Reliability  coefficient. — One  sample  yielded  the  estimates  of  reli¬ 
ability  given  in  table  19.17. 

(4)  Correlation  between  rights  and  wrongs. — For  a  sample  of  849 
pilots  tested  in  the  period  from  May  9  to  July  10,  1944,  the  correlation 
between correct and incorrect responses was -0.22.

Table 19.17. — Alternate-forms (part I vs. part II) reliability coefficients for
Object Recognition, CP523A, based upon a sample of 500 unclassified aviation
students¹

Score       r_1I     r_tt
Rights      0.72     0.[illeg.]
Wrongs       .62      .76

¹ Testing dates unidentified.

(5) Difficulty. — Based upon the responses of the above-mentioned
sample of 750 unclassified aviation students, the test yielded a mean
proportion of correct responses of 0.09, corrected for chance, with a range
from 0.50 to 0.86 and a standard deviation of 0.10.

(6) Factorial composition. — The most significant loadings for Thurstone's
Cubes, CP5[illegible]2A, the earlier form of Object Recognition, were
found in the spatial-relations (0.41), perceptual-speed (0.31), general-
reasoning (0.26), space II (0.25), and visualization (0.20) factors. The
communality was 0.53, to be compared with a reliability of 0.68. For a
fuller picture of the factorial composition of this test see Appendix B.

(7) Test validity. — Validation results are given in table 19.18.

Table 19.18. — Validity data for Object Recognition, CP523A, based upon
graduation-elimination of pilots in primary training

Score     N         p      M_g      M_e      SD_t         r_bis ᵃ
Rights    ¹1,130    0.86   29.28    27.16    7.51         0.15
Wrongs    ¹1,130     .86   11.36    12.71    5.4[illeg.]  -.11
Rights    ²849       .78   31.67    29.41    7.19          .17
Wrongs    ²849       .78   10.83    10.87    5.15         -.01
R-W       ²849       .78   20.84    18.54    10.11         .11

ᵃ Assuming an unrestricted stanine standard deviation of 2.00.
¹ In class 44I; tested in March 1944.
² Class unidentified. Sample tested from May 9 to July 10, 1944.

(8) Item validity. — Validation of items revealed a mean phi of 0.02,
based upon the responses of 687 graduates and 112 eliminees from primary
training in class 44I. The standard deviation of phi values was
0.05 and the range was from -0.08 to 0.31.

Evaluation. — Cubes, the predecessor of Object Recognition, was subjected
to factor analysis, but the adapted form was not. The factorial
picture is comparatively complex. The greatest portion of the total
variance of the test on a single factor is 17 percent, which is attributable to
the spatial-relations factor. Other factors showing some influence are
perceptual speed with 10 percent, general reasoning with 7 percent,
space II with 6 percent, and visualization with 4 percent of the total
variance. These factors almost exactly account for the pilot validity of
this test.
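
The percentages in the evaluation above can be checked against the loadings reported under factorial composition. The sketch below assumes uncorrelated (orthogonal) factors, so that the squared loading is the proportion of total test variance attributable to a factor; the variable and function names are illustrative only.

```python
# Loadings reported for Thurstone's Cubes under "Factorial composition".
loadings = {
    "spatial relations": 0.41,
    "perceptual speed": 0.31,
    "general reasoning": 0.26,
    "space II": 0.25,
    "visualization": 0.20,
}


def percent_of_total_variance(loading):
    """Squared loading as a whole-number percentage of total variance."""
    return round(100 * loading ** 2)


percents = {name: percent_of_total_variance(a) for name, a in loadings.items()}
print(percents)
# {'spatial relations': 17, 'perceptual speed': 10, 'general reasoning': 7,
#  'space II': 6, 'visualization': 4}
```

The five values reproduce the 17, 10, 7, 6, and 4 percent quoted in the text.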

Both Cubes and Object Recognition showed moderate validity for pilots,
but they failed to measure any function of known validity as well as
other tests already in the classification battery. It may be expected that
Object Recognition, when factor analyzed, will show even more perceptual
content than Cubes, because identifying and comparing
military insignia, such as those illustrated, is perceptually more difficult
than the simpler and more obvious type of symbol used in Thurstone's
Cubes test.

Position  Orientation,  CP526A  14 

This is an adaptation of Thurstone's Hands test. The validity of Hands
on a sample of 927 pilots was 0.26. On the same sample its correlation
with the pilot stanine was only 0.17, which indicated that the test has
substantial unique variance to offer. A revision was planned which
would be more reliable and more adaptable to the group-testing procedures
used in the AAF. The revision was also to include other right-
left members of the body as well as hands. Validation on new samples
was proposed in order to verify the validity of the Thurstone test.

Description. — Each item shows five drawings of right or left hands,
arms, legs, eyes, or feet. The examinee's task is to determine quickly
whether a drawing represents a right or a left member of the body.
Parts I and II of the test show only hands while parts III and IV show
hands, arms, legs, and feet.

(1) Internal characteristics. — The directions for part I contain four
sample items. Each item has five illustrations, each calling for a response,
making a total of 20 possible answers. Part I contains 26 items, making
a total of 130 recorded and scored answers; part II calls for 150 answers,
part III for 180, and part IV for 150. Directions and sample
items for part I require 5 minutes, and the directions and four sample
items for parts III and IV require an additional 3 minutes. The testing
time for parts I and II is 7 minutes each, and for parts III and IV 7 1/2
minutes each, making a total testing time of approximately 37 minutes.

14 Developed at Psychological Research Unit No. 3. Contributors: Cpl. James B.
[illegible] and Lt. [illegible].

A sample item from part I is shown in figure 19.13. Following are
parts of the directions:

In this test you will be shown a series of left and right hands in various positions.
Your problem is to determine which of these are left hands and which are right
hands. If the hand shown is a left hand, mark your answer in the answer sheet
column labeled left. If it is a right hand, mark your answer in the answer sheet
column labeled right. (See the illustration of the special answer sheet in fig. 19.14.)


[Figure 19.14. Section of answer sheet of Position Orientation, CP526A.]

For example, look at problem 1. The hand shown in picture A is a right hand.
Therefore, mark the space under A in the column headed right opposite problem 1
on your answer sheet. The hand in picture B is also a right hand, so the space under
B must be marked in the column headed right.

(3)  Scoring. — The  scoring  formula  is  R— W. 

Statistical results. — The data are fairly complete on this test and are
based upon examinees tested at Psychological Research Unit No. 3.

(1) Distribution statistics. — Typical examples of distributions are
given in table 19.19. The distribution curves are approximately symmetrical
and considerably flatter than normal.

Table 19.19. — Distribution constants for Position Orientation, CP526A, based upon
a sample of 578 unclassified aviation students¹

Parts           M        SD
I and II        187.0    45.5
III and IV      164.6    46.8

¹ Tested in October 1944.

(2) Reliability coefficient. — By correlating part I and part II, an
estimated reliability coefficient of 0.83, corrected for length, was obtained.
This figure is based on a sample of 500 unclassified aviation students
tested from August 10 to October 6, 1944.
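
The text does not name the correction for length that was applied. The sketch below assumes the conventional Spearman-Brown formula for a test made up of two parallel parts, and shows the part I vs. part II correlation that a corrected value of 0.83 would imply under that assumption. The function names are illustrative.

```python
def spearman_brown(r_part, k=2.0):
    """Reliability of a test k times as long as the parts that correlate r_part."""
    return k * r_part / (1.0 + (k - 1.0) * r_part)


def part_correlation_for(r_full, k=2.0):
    """Inverse of spearman_brown: part correlation implied by a corrected value."""
    return r_full / (k - (k - 1.0) * r_full)


implied = part_correlation_for(0.83)
print(round(implied, 2))                  # 0.71
print(round(spearman_brown(implied), 2))  # 0.83
```

Under this assumption, a part I vs. part II correlation of roughly 0.71 steps up to the reported 0.83 for the combined parts.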

(3) Factorial composition. — The only substantial loading (0.46) for
the test Hands, CP512A, the earlier version of Position Orientation, is
on the space II factor. The communality is 0.35. For a fuller picture of
the factorial composition of this test see Appendix B.

(4) Test validity. — Validation results based on several samples are
given in table 19.20.

Table 19.20. — Validity data for Position Orientation, CP526A, based upon
graduation-elimination of pilots in primary training

Part          Scoring    N¹     p      M_g       M_e       SD_t     r_bis      corr. r_bis²
              formula
I             Rights     847    0.80    91.28     88.41    22.04    0.13       0.18
II            do         847     .80    98.63     95.19    25.91     .08        .15
III           do         847     .80    87.04     84.01    22.11     .08        .11
IV            do         847     .80    91.95     89.27    24.41     .04        .10
I             Wrongs     847     .80     4.02      4.61     3.69    -.09       -.12
II            do         845     .80     5.29      6.58     4.86    -.15       -.20
III           do         847     .80     7.06      8.67     5.02    -.18       -.22
IV            do         847     .80    10.06     11.12     6.60    -.09       -.13
I and II      Rights     847     .80   191.91    183.60    44.29    [illeg.]    .13
I and II      Wrongs     845     .80     9.31     11.19     7.47    -.14       -.19
I             Rights     294     .83   101.44     92.00    21.60     .56        .21
II            do         294     .83   101.80     90.80    23.57     .26        .28
III           do         294     .83    93.50     84.14    21.22     .25        .30
IV            do         294     .83    93.46     82.94    22.91     .26        .30
I             Wrongs     294     .83     6.11      7.30     5.29    -.13       -.16
II            do         294     .83     7.37      7.44     6.29    -.01       -.01
III           do         294     .83     9.28      9.16     6.05     .01       -.04
IV            do         294     .83    11.33     11.78     6.99     .04       -.08
I and II      Rights     294     .83   203.24    182.80    42.59     .23        .26
III and IV    do         294     .83   186.96    167.08    41.91     .26        .31
I and II      Wrongs     294     .83    13.48     14.74     9.97    -.07       -.10
III and IV    do         294     .83    20.61     20.94    11.68    -.02        .04
I             R-W        294     .83    95.33     84.70    22.79     .18        .24
II            R-W        294     .83    94.43     83.36    25.08     .25        .27
III           R-W        294     .83    84.22     74.98    22.77     .23        .29
IV            R-W        294     .83    82.13     71.16    24.29     .25        .31
I and II      R-W        294     .83   189.76    168.06    44.70     .23       [illeg.]
III and IV    R-W        294     .83   166.35    146.14    44.05     .26        .29

¹ Samples of 847 or 845 were tested in the period June 12 to July 12, 1944; the
sample of 294 was tested in October 1944.
² Assuming an unrestricted stanine deviation of 2.00.


(5) Intercorrelations. — Part-score intercorrelations are given in table
19.21. The correlation between rights (parts I and II) and wrongs
(parts I and II) was -0.13. For another sample of 294 pilots tested in
October 1944, the correlation was -0.10. For this same sample, for the
score on parts III and IV, the correlation between rights and wrongs
was -0.05.

Table 19.21. — Part-score intercorrelations for Position Orientation, CP526A
(N = 847 classified pilots¹)

Part       Score      1       2       3       4       5       6       7       8
1. I       Rights     ....    0.70    0.64    0.64   -0.10   -0.11   -0.04    0.04
2. II      do         0.70    ....     .68     .71    -.06    -.11    -.06     .04
3. III     do          .64     .68    ....     .83    -.09    -.04    -.11     .02
4. IV      do          .64     .71     .83    ....    -.06    -.01    -.06    -.05
5. I       Wrongs     -.10    -.06    -.09    -.06    ....     .52     .45     .38
6. II      do         -.11    -.11    -.04    -.01     .52    ....     .46     .37
7. III     do         -.04    -.06    -.11    -.06     .45     .46    ....     .64
8. IV      do          .04     .04     .02    -.05     .38     .37     .64    ....

¹ Tested in the period June 12 to July 12, 1944.


Evaluation. — The validity coefficient of Position Orientation, parts I and II, was
lower (0.18 on the largest sample reported) than the figure obtained on
the original validation of the Hands test (0.26). Parts III and IV were
even less valid. Whether this difference in validities is due to factorial
differences between the original Thurstone form and the revised form
of tests cannot be determined from present data. The biserial r's for the
smaller sample were clearly in line with the 0.26 for Thurstone's version,
however, so the discrepancy may be a sampling matter. The mean
pilot validity based upon all available data is 0.20, which is slightly
higher than the expected validity of 0.16 (see ch. 28) predicted on the
basis of the factorial composition of the Thurstone test.

Parts  I  and  II  correlate  highly  with  parts  III  and  IV  but  not  as  highly 
as  with  each  other.  There  is  apparently  some  slight  difference  in  the 
ability  to  select  a  right  hand  and  to  select  a  right  leg,  eye,  or  arm.  Here 
may  be  evidence  of  restricted  subfactors. 

The chief interest in this test is in its factor content. Sixty percent of
Hands' known common-factor variance was found on a single factor.
Not more than 8 percent of the remaining common variance accumulated
on any other single factor. If it is truly a relatively pure measure
of a factor hitherto unknown, it is of great value. The loading in that
factor should be improved if possible. If the pilot validity of the Thurstone
version is actually 0.26, the margin between this and the expected
0.16 means either that the space II factor is more valid than was assumed
(0.05) or there is unknown valid variance in the test.
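
The "sixty percent" figure can be reproduced from the values reported earlier for Hands: a loading of 0.46 on space II and a communality of 0.35. The sketch assumes orthogonal factors, so that the squared loading divided by the communality gives the share of common-factor variance carried by that factor; the function name is illustrative.

```python
def share_of_common_variance(loading, communality):
    """Fraction of a test's common-factor variance lying on one factor."""
    return loading ** 2 / communality


share = share_of_common_variance(0.46, 0.35)
print(f"{share:.0%}")  # 60%
```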

Evaluation of Positional Discrimination Tests

Not a great deal more is known about these tests at the time of this
writing than was already known concerning their Thurstone predecessors
at the time work was begun on the revised forms. Validities on the new
forms are generally in line with those reported earlier for Thurstone's
original tests. The unique element that tests in this area have to offer is
the space II factor. This factor probably has a very low validity for
pilots, but a good test of it is needed. Its validity for navigators is
unknown. Further study is recommended to define the factor more clearly
and to maximize its variance in some test.

EVALUATION  OF  SPATIAL  TESTS 
Status  of  the  Area 

The accumulated data on tests in this area indicate that the principal
unique functions measured can be explained by two factors, herein
labeled space I and space II. Space I apparently has some kinship to
Thurstone's spatial factor (factor S), although it is better defined. Only
a select group of the tests found on factor S appear on space I. Hands,
for example, originally appearing with its principal loading on Thurstone's
space factor, split away from the space tests on a factor of its
own (space II). Another factor can be isolated by further rotation of
Thurstone's original published loadings which resembles the visualization
factor described in chapter 13. Substantial visualization variance was
apparently contained in factor S.

The pruning effects point out the need of further analysis, and a
renaming of factor S. Thurstone's description, "facility in spatial and
visual imagery," now seemingly fits the present visualization factor better
than it does space I. Attempts to explain the basic nature of the factor
have already been related in this chapter under the heading, Evaluation
of Directional Discrimination Tests. The problems in describing and
naming space II were outlined in the Evaluation of Positional Discrimination
Tests, the subarea just preceding this chapter evaluation.

New  Research  Indicated 

Further exploration of visual-spatial tests is needed along factorial
lines, for apparently it is only by the application of this technique that
useful conclusions can be reached. Factors cannot be well defined until
adequate tests are available.

Another line of research suggested by the statistical results on tests
in this area and in the visualization area reported in chapter 12 is to
study the effect of the difficulty level of the items on factor composition
of the test. It has been especially difficult, for example, to construct
items to measure visualization that do not involve a degree of reasoning.
If a visualization problem is made too difficult, it is likely solved by
reasoning. By reducing the difficulty, reasoning variance seems to be
reduced. If the problems are made too easy, however, they can be solved
without visualization, possibly by space I ability. In the Visualization of
Maneuvers test, an opportunity to observe the effect of complexity is
afforded. Correlations with reasoning and spatial tests are available for
all three forms. The forms involving the more complex items correlate
more highly with tests known to measure visualization, while the form
with the simpler, more speeded items correlates more highly with tests
known to measure space I. Factor-analysis results combined with systematic
control of difficulty are necessary to corroborate these evidences.

BIBLIOGRAPHY

(1) Kelley, T. L. Crossroads in the Mind of Man, Stanford University Press.
(2) Thurstone, L. L. Primary Mental Abilities, Psychometric Monograph No. 1,
University of Chicago Press, 1938.

CHAPTER TWENTY


Orientation Tests 1


INTRODUCTION 

Among the traits considered essential for effective air-crew performance
are the abilities to determine one's bearings with respect to points
of  the  compass  and  to  maintain  an  appreciation  of  one’s  location  relative 
to  landmarks  in  the  environment.  An  attempt  was  made,  by  the  use  of 
orientation  tests,  to  measure  these  abilities.  When  construction  of  the 
orientation  tests  was  begun,  few  instruments  that  measured  orientation 
of  any  type  existed.  The  few  orientation-test  items  available  were  con¬ 
sidered  definitely  inadequate. 

Job-Analysis  Information 

In table 1.5 it may be seen that in two samples of 1,000 each of students
eliminated from elementary pilot training, orientation was mentioned
13 percent and 15 percent of the time as a cause for elimination.
For 100 cases of elimination from advanced single-engine training, however,
this category was mentioned only 6 percent of the time; and for
100 cases of elimination from advanced twin-engine training, only 9
percent. In one sample of 100 reclassifications in operational training,
orientation was mentioned only 2 percent of the time, and in another
sample of the same size, it was not mentioned at all. One inference from
these figures could be that orientation is a much more important factor
in early pilot training than in later stages. Another could be that training
eliminates relatively early the men who are deficient in this respect.
There is probably some truth in both interpretations. Defects in orientation
show up conspicuously in the student's failure to execute maneuvers
properly and in his getting lost.

Under combat conditions, the importance of orientation ability is
probably much greater than it is in training. Reference to table 1.6 will
show that supervising officers rated the importance of observation and
orientation for pilots as fairly high. On a 9-point scale, in which 5
means better than average, the mean ratings were 7.2 for fighter pilots
and 5.5 for bomber pilots. For bombardiers and navigators in combat
(see tables 1.2 and 1.4), supervisors rated the trait of orientation and
observation highest in the list of 20 traits with a mean rating of 7.8 for
both of these air crew positions. Since the trait on the rating scales is
named orientation and observation, however, it is not clear how much
of this rating was based on orientation and how much on observation.

1 Written by Capt. John I. Lacey and Sgt. S. W. [illegible].

Two  Types  of  Orientation  Tests 

The subareas adopted for orientation measurement are (1) compass
orientation and (2) pattern orientation. This a priori division is based
on the superficial requirements of the various orientation tests. Under
the rubric of compass orientation are placed those tests that require the
examinee explicitly to use the points of the compass. The tests are
Directional Orientation, Following Oral Directions, Compass Directions, and
Compass Orientation. Spatial Orientation, Aerial Landmarks, and Star
Identification are considered pattern-orientation tests. These tests
require the examinee to identify geographical parts within a whole.


COMPASS  ORIENTATION  TESTS 


Directional Orientation, CP515B and C 2


Air crew members must be familiar with the points on the compass
and must also be able to apprehend directions quickly and accurately
despite various conditions conducive to disorientation. Directional
Orientation tests were designed to measure the speed with which directions
can be accurately recognized despite various degrees of rotation of the
compass rose out of its usual position (as conventionally represented).
It was thought that a test of this ability is analogous to a direct test of
the ability of a pilot to remain directionally oriented in spite of sudden
and frequent changes in direction of flight.

Description. — In Form B, each item consists of six circles, with one
direction indicated on each. Each circle is rotated out of its conventional
position on the page, i. e., with north at the top and east to the right. The
first circle in each problem is called the "given circle." The task of the
examinee is to determine which circles of the remaining five, if superimposed
on the given circle, would have indicated directions which would
be in proper relationship to the indicated direction on the given circle.
Thus, in figure 20.1, with N pointing as it does, circles 1, 3, and 4
match the given circle, while 2 and 5 do not.
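
The matching rule for Form B can be sketched as follows. Each circle is represented, by assumption, as a compass label plus the page position (degrees clockwise from the top of the page) at which that label is printed; a circle matches the given circle when both imply the same page position for north. The angles below are invented for illustration and are not read from figure 20.1.

```python
# Compass bearings of the eight points, in degrees clockwise from north.
BEARINGS = {"N": 0, "NE": 45, "E": 90, "SE": 135,
            "S": 180, "SW": 225, "W": 270, "NW": 315}


def north_position(label, page_angle):
    """Page position of north implied by one labeled point on a rotated circle."""
    return (page_angle - BEARINGS[label]) % 360


def matches(given, other):
    """True when two circles would agree if one were superimposed on the other."""
    return north_position(*given) == north_position(*other)


given = ("N", 135)  # hypothetical: N printed at the lower right of the circle
alternates = [("E", 225), ("S", 135), ("SW", 0), ("NW", 90), ("W", 90)]
print([i + 1 for i, alt in enumerate(alternates) if matches(given, alt)])
# [1, 3, 4]
```

Reducing every circle to its implied north position turns the five comparisons into simple equality checks, which is presumably the mental shortcut a well-oriented examinee approximates.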

The task of the examinee in form C is essentially the same, but the
form of the problems is somewhat changed. The examinee is presented
with five circles, each with but one direction indicated, and each rotated
out of position as in Form B. Through each circle is drawn a diameter,
at one end of which is an arrowhead. The examinee must determine
which of these circles contain arrowheads pointed in a given direction.
Thus, in figure 20.2, the arrows in 1, 3, and 4 point northeast, the given
direction, but those in 2 and 5 do not.

2 Developed at Office of the Surgeon, Headquarters, AAF Training Command. Chief
contributors: Maj. [illegible] and Maj. George F. J. Lehner. These two forms and form A
were based upon Dr. Paul Woodring's research on directional orientation, the manuscript of
which was generously made available to the Army Aviation Psychology Program.

[Figure 20.1. Practice problem of Directional Orientation, CP515B.]

[Figure 20.2. Practice problem of Directional Orientation, CP515C.]

(1) Internal characteristics. — There are 2 unscored sample items and
28 scored items in each form.

(2) Administration. — Reading of the directions for each form requires
from 3 to 5 minutes, while testing time is 18 minutes for Form B
and 20 minutes for Form C, allowing approximately 50 percent of the
examinees to finish.

(3) Scoring. — The scoring formula is R-W. The maximum score
for Form B is 70, and for Form C, 60. While separate scores for Forms
B and C were obtained, they were combined in computing some of the
statistics reported.

Statistical  results. — Except  where  noted  below,  the  following  data  are 
for  examinees  tested  at  Psychological  Research  Unit  No.  3  in  March 
1943. 

(1)  Distribution  statistics.— A  sample  of  392  unclassified  aviation  stu¬ 
dents  yielded  a  mean  score  of  47.0  and  a  standard  deviation  of  17.4  for 
Form  B,  and  a  mean  of  42.6  and  a  standard  deviation  of  13.9  for 
Form  C. 

(2)  Internal  consistency. — The  degree  of  homogeneity  of  the  items  is 
indicated  by  a  mean  internal-consistency  phi  of  0.49,  a  standard  devia¬ 
tion  of  the  phi  distribution  of  0.21,  and  a  range  of  values  from  0.14  to 
0.85.  These  statistics  are  based  upon  analysis  of  the  responses  of  the 
highest  27  percent  and  the  lowest  27  percent  in  total  score  (the  two  forms 
combined)  of  a  group  of  360  unclassified  aviation  students. 

(3) Reliability coefficient. — Based upon a sample of 392 unclassified
aviation students, a correlation of 0.74 was found between forms B and
C. This figure provides a conservative estimate of the reliability of either
form.

(4) Correlation between rights and wrongs. — For a sample of 339
pilots tested in July and August 1944 at Psychological Research Unit
No. 3, the correlation between rights and wrongs was -0.48. A correlation
of -0.36 was secured for a sample of 751 navigators tested in February
and March 1944 at Selman and Ellington Fields and at Psychological
Research Unit No. 3.

(5) Difficulty. — Based upon item analysis of the responses of 360
unclassified  aviation  students,  the  test  (forms  B  and  C  combined) 
yielded  a  mean  proportion  of  correct  responses  of  0.68,  corrected  for 
chance,  with  a  range  from  0.33  to  0.94  and  a  standard  deviation  of  0.17. 

(6) Factorial composition. — The noteworthy loadings of form B are
in the spatial-relations (0.41), visual-memory (0.36), general-reasoning
(0.31), visualization (0.26), and numerical (0.22) factors. The
communality is 0.56. For a fuller picture of the factorial composition of this
test, see appendix B. Form C was not analyzed because the correlational
patterns indicated that the two tests were almost identical factorially.

(7)  Test  validity. — Validation  results  based  on  several  samples  are 
given  in  table  20.1. 


Table 20.1. — Validity data for Directional Orientation, CP515B, based upon
the graduation-elimination criterion

Group                            Score     N_t    p_g    M_g           M_e      SD_t     r_bis    corr. r_bis ᵃ
Pilots in primary training¹      Rights    339    0.82   60.63         56.43    11.23    0.21     0.3[illeg.]
Do¹                              Wrongs    339     .82    4.41          4.36     4.29     .01     -.16
Do¹                              R-W       339     .82   56.22         52.07    13.86     .17      .33
Do¹                              R-W/4     339     .82   59.5[illeg.]  55.34    11.79     .20      .35
Pilots through basic training²   R-W       563     .68   38.33         31.18    17.85     .34      .31
Navigators³                      Rights    751     .91   58.52         51.76    12.31     .28      .38
Do³                              Wrongs    751     .91    2.79          4.28     4.08    -.19     -.27

ᵃ Assuming an unrestricted stanine standard deviation of 2.00.
¹ Tested July 6 to Aug. 12, 1944 at Psychological Research Unit No. 3.
² In classes [illegible] and 44C. Tested at Psychological Research Unit No. 3.
³ Tested Feb. 11, 1944 at Selman Field, Feb. 1 and 2, 1944 at Ellington Field, and
Mar. 21, 1944 at Psychological Research Unit No. 3.
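
The report does not state how the biserial coefficients were computed. As a hedged check, the sketch below applies the standard biserial formula to the first row of table 20.1 (rights scores of pilots in primary training), using the mean of graduates, the mean of eliminees, the total standard deviation, and the proportion graduating. The function name is illustrative, and `statistics.NormalDist` requires Python 3.8 or later.

```python
from statistics import NormalDist


def biserial_r(mean_grad, mean_elim, sd_total, p_grad):
    """Standard biserial r for a pass-fail criterion (assumed formula)."""
    q = 1.0 - p_grad
    z = NormalDist().inv_cdf(p_grad)  # normal deviate cutting off proportion p
    y = NormalDist().pdf(z)           # normal ordinate at that cut
    return (mean_grad - mean_elim) / sd_total * (p_grad * q / y)


# First row of table 20.1: M_g = 60.63, M_e = 56.43, SD_t = 11.23, p = 0.82.
print(round(biserial_r(60.63, 56.43, 11.23, 0.82), 2))  # 0.21
```

The result agrees with the tabled value of 0.21, which suggests, without proving, that the tabled coefficients were obtained in this way.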

Evaluation. — Forms B and C of Directional Orientation possess a
fairly high degree of homogeneity, while the items are not particularly
difficult. As portrayed by factor analysis, 56 percent of the total variance
of form B has been accounted for by common factors. Of the total variance,
spatial-relations contributes 17 percent, visual-memory 13 percent,
general-reasoning 10 percent, visualization 7 percent, and the numerical
factor 5 percent. Most of the remaining 4 percent of the total variance
is accounted for by three other factors, none contributing more than
1 percent.

Against  the  pilot  criterion,  the  test  has  very  satisfactory  validity;  and 
it  has  a  higher  biserial  with  the  navigation  criterion.  Its  validity  for  the 
y’I'Jt  can  be  attributed  to  its  variances  in  spatial-relations,  visualization, 
and  visual  memory.  Its  validity  for  the  navigator  is  due  to  its  variances 
in  spatial-relations,  general-reasoning,  visualization,  and  numerical  fac¬ 
tors.  No  compass-orientation  factor  as  such  has  appeared,  but  this  may 
be  because  there  was  never  more  than  one  test  involving  compass  direc¬ 
tions  in  any  analyzed  battery. 
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Directional  Orientation,  CP515A  * 

Variation.—CP515A, the first Directional Orientation test constructed,
has the same purpose as forms B and C, but its administrative directions
and item construction are considerably different. Each test item consists
of a 2 3/16 inch circle with an arrow through the circumference indicating
north. Within each circle are two arrows, labeled B and C, which
represent the two legs of the flight of a plane, one the path of a plane
before it turns, the other, the path after it turns. It is the examinee's
task to give (A) the direction he is going before entering the turn, and
(B) the direction he is going after making the turn. The directions are
to be given in relation to north as indicated on the circumference of the
circle. The answers are recorded by filling a space under either N, NE,
E, SE, S, SW, W, or NW. There are 36 items in the test, the first 6 of
which are practice items. Since each of the 30 scored items calls for two
responses, the maximum number of correct responses is 60. The scoring
formula is R - W. Figure 20.3 shows three items of the test.


FIGURE  20.3 

SAMPLE  ITEMS  OF  DIRECTIONAL  ORIENTATION. 

CP515A

Statistical results. (1) Distribution statistics.—A sample of 392
unclassified aviation students, tested in March 1943 at Psychological
Research Unit No. 3, yielded a mean score of 47.0 and a standard deviation of
17.4. The distribution curve was approximately symmetrical and somewhat
flatter than normal.

(2)  Test  validity. — Validation  results  based  on  several  samples  are 
given  in  table  20.2. 


Table 20.2.—Validity data for Directional Orientation, CP515A, using the
graduation-elimination criterion

Group                             N_t   p_g    M_g     M_e     SD_t    r_bis
Pilots in primary training *      592   0.78   37.62   31.14   17.04   0.21
Pilots through basic training *   563    .67   38.33   31.18   17.85    .34
Flexible gunners in training      349    .95   43.02   34.47   15.23   [illegible]

* Same sample followed through the first two phases of training. In classes 44B and 44C,
tested at Psychological Research Unit No. 3 in April 1943.


* Developed at Office of the Surgeon, Headquarters, AAF Training Command. Chief
contributors: Maj. James J. Gibson and Maj. George P. J. Lebner.

Evaluation.—The distribution statistics indicate that the items in the
test are of moderate difficulty and have a reasonable amount of spread
in difficulty. The test appears to be moderately valid for both pilots and
flexible gunners.

Directional  Orientation,  CP515D,  E,  and  F  4 

The rationale for the construction of forms D, E, and F of Directional
Orientation is basically the same as that for forms A, B, and C.
Aerial photographs are employed in these forms, however, while diagrams
are used in forms A, B, and C. The use of photographs allows
not only for rotation of compass directions, but also for presenting views
at various angles, e. g., vertical and oblique.

Description.—The items are composed of circular parts of aerial
photographs, 2 3/16 inches in diameter. Both photographs in a pair are of
the same portion of the landscape, but the second one is rotated. A
single compass direction is indicated in the first photograph, and an
arrow is drawn showing an unnamed direction in the second. The examinee's
task is to determine the compass direction of the arrow. In forms
D and F, vertical views are used for both photographs; while in form
E, the initial photograph is a vertical view and the second photograph is
an oblique (and rotated) view of the same terrain. Forms D and E are
printed together in one booklet as parts I and II. Form F is a reprint of
form D, designed for administration in a special intercorrelational study.

(1) Internal characteristics.—Form D includes 1 recorded but unscored
sample item and 44 scored items. Form E contains 3 recorded
but unscored practice items and 45 scored items. Figure 20.4 shows two
items of form D, and figure 20.5 shows two of form E.

(2) Administration.—Answers are recorded directly on a special IBM
answer sheet with marking spaces after each item number labeled N S
E W, standing for north, south, east, and west. Eight minutes are allowed
for part I, 16 minutes for part II, and 6 minutes are considered
adequate for administration of directions. The total testing time, therefore,
is 30 minutes.

(3)  Scoring. — The  scoring  formula  is  R  —  W/3. 
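The divisor in this formula is one less than the number of answer choices (N, S, E, W), which is the form of the conventional correction for guessing. The report does not spell out that reasoning, so the following is only a minimal sketch under that assumption; the function name and the examinee counts are hypothetical.

```python
def formula_score(rights, wrongs, n_choices):
    """Rights minus wrongs/(n_choices - 1), the usual correction for guessing."""
    return rights - wrongs / (n_choices - 1)


# Forms D and E offer four answers (N, S, E, W), hence R - W/3.
# Hypothetical examinee on the 44 scored items of form D:
print(formula_score(rights=30, wrongs=9, n_choices=4))   # 27.0

# Blind guessing on all 44 items is expected to yield 11 rights and
# 33 wrongs, which the formula reduces to zero.
print(formula_score(rights=11, wrongs=33, n_choices=4))  # 0.0
```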

Statistical results. (1) Distribution statistics.—For a sample of 500
unclassified aviation students tested in July and August 1944, at
Psychological Research Unit No. 3, and in March 1945, at Psychological
Research Unit No. 2, the mean scores (rights only) were 29.5 and 19.9
for Forms D and E respectively; the standard deviations were 8.7
and 6.3.

(2) Reliability coefficient.—No satisfactory estimates of reliability are
available. The correlation between Forms D and E, however, for the
above-mentioned sample of 500 unclassified students, was 0.36. Since
the forms are different, this is a gross underestimate of the reliability.

«nVwc.IO£?f|«  {WSS-1  Unit  N°'  y  °"et  con,rihwor»:  S«-  Hrnun  Hclltr

FIGURE 20.4

SAMPLE ITEMS OF DIRECTIONAL ORIENTATION,

CP515D


FIGURE 20.5

SAMPLE ITEMS OF DIRECTIONAL ORIENTATION,

CP515E

Evaluation.—The utilization of aerial photographs in these forms,
while adding face validity, has probably had the effect of introducing
some perceptual-speed variance into the test. Forms D and E have a
very modest intercorrelation; apparently the addition of the feature of
oblique views in Form E has introduced new factors, quite possibly held
in common with Aerial Landmarks, CP525 (see pp. 535f.).

Following Oral Directions, CI651AX *

When this test was designed, two rationales were advanced to justify
its construction. First, it was regarded as an integration test (see ch.
10). The rationale for all integration tests also applies here. Second, it
was regarded as a directional-orientation test. It was considered to be
especially important for the combat pilot to be able to react quickly and
correctly to the movements of enemy aircraft. He must rapidly alter the
course of his airplane, making continuous changes in both direction and
altitude in order to out-maneuver enemy planes. All of this time he must
maintain correct orientation and, in addition, maintain a constant state
of readiness for surprise attacks from new positions.

Description.—Verbal descriptions are presented, by means of phonograph
records, of planes flying and being attacked by enemy planes from
various directions. The examinee is directed to imagine that his plane
has executed a maneuver according to rules laid down in preliminary
instructions. The answer to the item is the direction in which the plane is
flying at the end of the maneuver. The items are made more difficult as
the test progresses, by increasing the number of attacking planes from
one to two, and by requiring the examinee to adhere to rules governing
altitude as well as direction.

(1) Internal characteristics.—In part I, the problems involve one
attacking plane and require an answer in terms of changed direction only.
In part II, two attacking planes must be considered in sequence, but the
answer again is in terms of changed direction only. Parts III and IV
require answers in terms of changed direction and altitude. Part III
involves one attacking plane, while part IV involves two planes attacking
in sequence.

The directions contain five unscored sample items. There are 5 items
in part I, presented at 4-second intervals; 10 in part II, at 8-second
intervals; 5 in part III, at 6-second intervals; and 35 in part IV, at
15-second intervals. Parts I, II, and III were designed as gradual training
for the complicated part IV, which was considered to be the heart of the
test. This accounts for the part-differences in number of problems. The
items are presented in series of five.

(2) Administration.—Each examinee receives a work sheet and a
15-place IBM answer sheet. Answers are marked directly on the work
sheet and must be transcribed to the IBM answer sheet after the test is
completed. The transcribing of answers takes approximately 10 minutes.
The total administration and testing time is approximately 45
minutes.

* Developed at Psychological Research Unit No. 3. Chief contributors: S/Sgt. J. Gordon
Ktkin, Sgt. Nathan Kravetz, and T/Sgt. Sanford J. Motlt.

Relevant  parts  of  the  directions  to  part  I,  which  illustrate  the  task  of 
the  examinee,  follow: 

This  is  a  test  to  see  how  well  you  can  keep  several  different  facts  in  your  mind 
while  receiving  orders  to  change  course  *  *  * 

If an enemy plane attacks from your right, you will turn to the left. If it attacks
from the left, you will turn right. If you are attacked from the rear, you will
continue on your course, and if you are attacked head on, you will reverse your
directions.

These  left,  right,  front,  rear  movements  will  be  changed  into  compass  directions— 
north,  south,  east,  west. 

For example—if you are told that you are heading north and are attacked from
the left you will, according to the rule, turn right, which in this case would be east.
East, then, is the new direction of your plane.

Now if you are told that you are being attacked from the west, you will continue
on your course to the east, because the west in this case is the rear.

The examinee is told that, within each series of problems, * * * the direction
you start from in each problem will be the direction you gave as the answer
in the previous problem.

The  examinees  record  their  answers  by  marking  N,  S,  E,  or  W  after 
the  number  of  the  problem  on  the  work  sheet. 

The  sample  problems  of  part  I  are: 

Write the answer to this first problem on your work sheet opposite number 1.

1. You are flying west and are attacked from the left. Write on the work sheet
the direction in which you are now headed. This is the new direction of your plane.
You will make your next move from this direction. I shall repeat No. 1. You are
flying west and are attacked from the left.

2. You are attacked from the south.

3. You are attacked from the right.

4. A plane attacks from the rear.

5. Attack comes from the left.

The item numbers are read each time before reading the item. The
answers to the sample items are given, and the rules of the test are again
emphasized before starting part I.
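The part I rules quoted above amount to a small state machine: each attack either turns the plane a quarter turn, reverses it, or leaves it alone, and each answer becomes the starting heading for the next problem. The following sketch works the five sample problems under that reading. All names are hypothetical, and the report itself does not print the keyed answers.

```python
COMPASS = ["N", "E", "S", "W"]                        # clockwise order
RELATIVE_SIDES = ["front", "right", "rear", "left"]   # clockwise from the nose

# Quarter turns clockwise for each attack side, per the quoted rules:
# attacked from the right -> turn left; from the left -> turn right;
# from the rear -> continue; head on -> reverse.
TURN_FOR_ATTACK = {"right": -1, "left": 1, "rear": 0, "front": 2}


def relative_side(heading, attack_from):
    """Express an attack given as a compass point as front/right/rear/left."""
    if attack_from in TURN_FOR_ATTACK:
        return attack_from
    offset = (COMPASS.index(attack_from) - COMPASS.index(heading)) % 4
    return RELATIVE_SIDES[offset]


def new_heading(heading, attack_from):
    """Heading after evading one attack."""
    side = relative_side(heading, attack_from)
    return COMPASS[(COMPASS.index(heading) + TURN_FOR_ATTACK[side]) % 4]


def run_series(start, attacks):
    """Answer a series; each answer is the start of the next problem."""
    answers, heading = [], start
    for attack in attacks:
        heading = new_heading(heading, attack)
        answers.append(heading)
    return answers


# The five sample problems of part I, starting from a westerly heading.
print(run_series("W", ["left", "S", "right", "rear", "left"]))
# ['N', 'N', 'W', 'W', 'N']
```

The same functions reproduce the worked example in the directions: heading north and attacked from the left gives east, and a following attack from the west leaves the heading at east.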

Relevant  parts  of  the  directions  for  part  II  are: 

In the next series of maneuvers you will be attacked by two enemy planes in
rapid succession. These planes will approach one after the other from the same or
different directions. You are to change your course as the attacks are made.
Remember, you change your course in responding to one plane before you consider the next.
After you have evaded the first attacking plane, you are no longer concerned with it
and you immediately maneuver to escape the second plane * * * your answer
will be the direction in which you are headed after both attacks.

In part III the examinee is told that his actions will be affected by
altitude as well as direction. The instructions administered in parts I
and II still apply to part III with the following addition:

If any enemy attacks from a higher altitude than your own, you dive. If you
are attacked from a lower altitude, you climb.

Practice  problems  are  administered  and  explained  before  commencing 
part  III. 

In part IV the examinee is told that he will be subjected to attack
from the same or different direction and altitude by two planes in
succession and that he will complete the movement necessary to escape the
first plane before responding to the second. The answer is the new
direction after both enemy attacks have been evaded.
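A part IV item can be pictured as two evasions applied in sequence, each changing heading and altitude. The sketch below is illustrative only: the report does not say how altitude was expressed on the work sheet, so altitude is tracked here as an arbitrary integer level, an attack from the same level is assumed to leave altitude unchanged, and all names are hypothetical.

```python
COMPASS = ["N", "E", "S", "W"]  # clockwise order

# Direction rules from parts I and II (quarter turns clockwise).
TURN = {"right": -1, "left": 1, "rear": 0, "front": 2}

# Altitude rule added in part III: dive from a higher attacker, climb
# from a lower one. "same" is an assumption not covered by the quoted rule.
LEVEL_CHANGE = {"higher": -1, "lower": 1, "same": 0}


def evade(state, attack_side, attack_level):
    """Apply one evasion to a (heading, level) state."""
    heading, level = state
    heading = COMPASS[(COMPASS.index(heading) + TURN[attack_side]) % 4]
    return heading, level + LEVEL_CHANGE[attack_level]


def part_iv_item(state, first_attack, second_attack):
    """Complete the first evasion before responding to the second plane."""
    state = evade(state, *first_attack)
    return evade(state, *second_attack)


# Hypothetical item: heading north at level 0; first plane from the left
# and above, second plane head on and below.
print(part_iv_item(("N", 0), ("left", "higher"), ("front", "lower")))
# ('W', 0)
```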

Two illustrative problems are explained before beginning part IV.

At the conclusion of part IV, answer sheets are distributed so that the
responses from the work sheets may be transcribed.

(3)  Scoring. — The  scoring  formula  is  R— W/5. 

Statistical  results.  (1)  Distribution  statistics. — A  sample  of  1,302 
classified  pilots  in  class  44E  (tested  at  Psychological  Research  Unit  No. 
3)  yielded  a  mean  score  of  37.2  and  a  standard  deviation  of  9.8.  The 
distribution  curve  is  moderately  negatively  skewed. 

(2) Internal consistency.—Based on 800 pilots tested in October 1943
at Psychological Research Unit No. 3, an item analysis was made for
the highest 25 percent and the lowest 25 percent of the cases in total
score. The mean internal-consistency phi is 0.40, the standard deviation
0.10, and the range is from 0.10 to 0.60.
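An internal-consistency phi of this kind can be computed from the fourfold table formed by crossing the extreme groups with passing or failing the item. The sketch below uses the standard phi formula with invented counts (200 cases in each extreme group, that is, 25 percent of 800); the report gives only the summary of the resulting coefficients, not item-level counts.

```python
from math import sqrt


def phi_coefficient(upper_pass, upper_fail, lower_pass, lower_fail):
    """Phi for a 2 x 2 table of extreme group (upper/lower) by item result."""
    a, b, c, d = upper_pass, upper_fail, lower_pass, lower_fail
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        return 0.0  # a margin is empty, so phi is undefined; report zero
    return (a * d - b * c) / denominator


# Hypothetical item: passed by 170 of the top 200 and 90 of the bottom 200.
print(round(phi_coefficient(170, 30, 90, 110), 2))  # 0.42
```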

(3) Reliability coefficient.—By the alternate-forms method, an estimated
reliability coefficient of 0.70, corrected for length, was obtained.
This figure is based on a sample of 278 unclassified aviation students
tested in September 1943 at Psychological Research Unit No. 3. Alternate
forms were secured by arbitrarily dividing the test into first half
and second half. Since these two halves are not entirely comparable,
0.70 is probably an underestimation of the test's reliability.
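The phrase "corrected for length" is taken here to mean the Spearman-Brown step-up from a half-length test to the full test; the report does not name the formula, so this is an assumption. Under it, the sketch below shows what half-to-half correlation would correspond to the reported 0.70. The function names are hypothetical.

```python
def spearman_brown(r_part, length_factor=2.0):
    """Reliability of a test lengthened by length_factor (Spearman-Brown)."""
    return length_factor * r_part / (1.0 + (length_factor - 1.0) * r_part)


def part_correlation(r_full, length_factor=2.0):
    """Invert Spearman-Brown: the part correlation implied by r_full."""
    return r_full / (length_factor - (length_factor - 1.0) * r_full)


# Correlation between halves implied by a corrected reliability of 0.70.
r_half = part_correlation(0.70)
print(round(r_half, 2))                  # 0.54
print(round(spearman_brown(r_half), 2))  # 0.7
```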

(4) Difficulty.—For analytical purposes, parts I, II, and III were
grouped together and compared with part IV. For the first three parts,
the mean proportion of correct responses, corrected for chance success,
is 0.80, with a range from 0.65 to 0.97 and a standard deviation of 0.08.
For part IV, the corrected figures are: Mean, 0.58; range, 0.29 to 0.77,
and standard deviation, 0.12. These data are based upon the
above-mentioned sample of 800 pilots.
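A proportion "corrected for chance success" is commonly obtained by removing the share of correct answers expected from blind guessing. The sketch below shows that conventional correction; the report does not state the exact formula or the number of choices it assumed for each part, so the function name and the example values are hypothetical.

```python
def corrected_proportion(p_correct, n_choices):
    """Proportion correct after removing the share expected from guessing."""
    chance = 1.0 / n_choices
    return (p_correct - chance) / (1.0 - chance)


# Hypothetical item with four answer choices, passed by 85 percent.
print(round(corrected_proportion(0.85, 4), 2))  # 0.8
```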

(5) Factorial composition.—The strongest loadings were found in the
spatial-relations (0.28), general-reasoning (0.27), integration II (0.25),
numerical (0.21), and visualization (0.20) factors. The communality is
0.42. For a fuller picture of the factorial composition of this test, see
appendix B.

(6) Test validity.—Validation results are given in table 20.3.


Table 20.3.—Validity data for Following Oral Directions, CI651A, graduation-elimination criterion

Group                              N_t     p_g    M_g     M_e     SD_t   r_bis   Corrected r_bis ¹
Pilots in primary training ²       1,302   0.91   37.60   34.00   9.75   0.19    0.24
Pilots through basic training ²    1,202    .86   37.76   34.22   9.74    .20     .25

¹ Assuming an unrestricted stanine standard deviation of 2.00.
² In class 44E. Tested at Psychological Research Unit No. 3.


Evaluation.—The adequacy of the training provided in parts I, II,
and III is unknown. Part IV, however, is considerably more difficult
than the preceding parts, even though the training probably aids
performance in it. The simple and complex parts of this test deserve
separate analysis and exploitation.

The reliability is probably satisfactory, but the pilot validity is only
moderate. Only 42 percent of the total variance of this test has been
accounted for by common factors. Of the total variance, the spatial-relations
factor accounts for 8 percent, general reasoning 7 percent, and
integration II 6 percent. The major portion of the remaining variance
is well dispersed among four other factors, none contributing more
than 4 percent. Contrary to what might be expected for a verbal test of
this kind, the verbal factor accounts for only 3 percent of the total variance.
The probable explanation is that the verbal level of the test is so
low that the differences in verbal comprehension among the examinees
are inconsequential. Since the communality is so much lower than the
reliability, the primary future interest in this test should be in identifying
its unknown variance. Since the test is exceedingly complex, it has little
or no value as a classification instrument. It is possible that separate
analyses of the parts would separate the factor variances somewhat. Since
the pilot validity predicted from its factor loadings (0.20) is close to the
obtained validity (0.24), no new factor valid for pilots is promised by
the test.
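The percentages in this evaluation follow from the factor loadings reported under factorial composition: with uncorrelated factors, each factor's share of the total variance is the square of its loading, and the communality is the sum of those squares over all common factors. The sketch below repeats that arithmetic. It assumes an orthogonal solution, which the report does not state in this passage, and the variable names are hypothetical.

```python
# Loadings and communality as reported for this test.
loadings = {
    "spatial relations": 0.28,
    "general reasoning": 0.27,
    "integration II": 0.25,
    "numerical": 0.21,
    "visualization": 0.20,
}
communality = 0.42

# Percent of total variance per factor = 100 x squared loading.
percent_share = {name: round(100 * a ** 2) for name, a in loadings.items()}
print(percent_share)

# Variance covered by the five listed factors, and the part of the
# communality left for factors not listed individually (e.g., verbal).
listed = sum(a ** 2 for a in loadings.values())
print(round(listed, 2), round(communality - listed, 2))  # 0.3 0.12
```

The first three shares come out at 8, 7, and 6 percent, matching the figures quoted in the evaluation.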

Variations.—Two variations were constructed because it was thought
desirable to provide two tests, one composed of the simple items of part
I of the AX form, and another composed of the more complex items
of part IV of that form. This decision was reinforced when it was
found, for a sample of 270 cases, that the items in the first half of the
test (parts I, II, III, and some items of part IV) correlated 0.26 and
0.22 with Mathematics B, CI206C, and Figure Analogies, CI212AX1;
whereas the items in the last half of the test (remaining items of part IV)
had correlations of 0.35 and 0.38 respectively. Apparently, then, the more
complex items in CI651AX have more reasoning content than the simpler
items.

Following Oral Directions, CP651BX

This form is an expanded version of part I of CP651AX. There are
5 unscored practice items and 145 scored items. Four seconds are
allowed for each practice item. Items 6 through 150 are administered by
phonograph with 2-second intervals between items. Part I is made up of
items 6 through 75, while part II consists of items 76 through 150. At
the conclusion of the test, the responses are transcribed to an IBM
answer sheet. Testing time, directions, and transcription require
approximately 25 minutes. The scoring formula is R - W/3.

Statistical  results.  (1)  Distribution  statistics. — For  a  sample  of  1,167 
pilots  in  class  44G,  tested  at  Psychological  Research  Unit  No.  3,  the 
mean  score  was  116.6  and  the  standard  deviation  29.3. 

(2) Internal consistency.—The degree of homogeneity of the items is
indicated by a mean internal-consistency phi of 0.39, a standard deviation
of the phi distribution of 0.09, and a range of values from 0.15 to
0.54. These statistics are based upon analysis of the responses of the
highest 27 percent and the lowest 27 percent of a group of 750
unclassified aviation students tested in May 1944 at Psychological Research
Unit No. 3.

(3) Reliability coefficient.—By the alternate-forms method, an estimated
reliability coefficient of 0.85, corrected for length, was obtained,
based on a sample of 487 pilots in class 44G, tested at Psychological
Research Unit No. 3. For 1,652 navigators, the corrected reliability of rights
is 0.82, and for wrongs, 0.77. These navigators were tested in April,
May, and June 1944 at the three Psychological Research Units.

(4)  Correlation  between  rights  and  wrongs. — Data  on  the  correlation 
of  rights  and  wrongs  are  given  in  table  20.4. 


Table 20.4.—Correlation between rights and wrongs for Following Oral
Directions, CI651BX

1  Tnlfd  from  M*jr  f,  If  14  lo  Au<.  I.*.  1  *11  at  I'  jiholofiol  Rrsriffli  I’nit  N#  1
1  Tnff<l  June  1  jr«f  ’  IOI4  if  I'mcM  s  icj?  Rr  ortk  I’nl  No  I.  Ape.  \?  fo  .1,  lfl4  it
l*'fcho!cfkiI  HcsxarcH  L  nil  No  I:  inJ  May  t  to  &,  it  PtpKolopal  Krv*im  l‘ml  No.  S.


(5) Difficulty.—Based upon a sample of 800 unclassified aviation
students tested at Psychological Research Unit No. 3 in May 1944, the
mean proportion of correct responses, corrected for chance success, is
0.68, with a range from 0.33 to 0.95, and a standard deviation of 0.15.

(6) Test validity.—Validation results are given in table 20.5.

(7) Item validity.—Validation of items revealed a mean phi of 0.07,
with a range from -0.06 to +0.20, based upon the responses of 600
graduates and 117 eliminees from primary pilot training in class 44G,
tested in December 1943 and January 1944 at Psychological Research
Unit No. 2.

Evaluation.—The reliability of this test is quite satisfactory. Its
validity for pilots seems to be about the same as for the AX form.
Navigator validities are considerably higher. One noteworthy finding is that
the rights alone are perhaps more valid for navigators than is a formula
score. The wrongs have considerable dispersion and validity in their own
right, and a study of the optimal scoring formula seems indicated.

Following Oral Directions, CI651CX

This form includes the same type of items and administrative directions
as part IV of CI651AX. There are 3 practice items and 40 test
items, with 15-second intervals between test items. Administration,
testing, and transcription of answers require approximately 40 minutes. The
scoring formula is R - W/5. No statistical data concerning this test are
available at present.

Compass Orientation, CI660A *

This test was developed as a result of the promising validity reports
on Following Oral Directions. It is an attempt to provide a simple
and pure test of the function thought to be unique to Following Oral
Directions, namely the ability to orient rapidly to changes in direction.

Description. — In  each  item,  one  of  the  four  compass  directions,  north, 
south,  east,  or  west,  is  presented  as  an  initial  direction  of  flight.  Then  a 
turn,  either  left  or  right,  is  given.  The  examinee’s  task  is  to  record  the 
new  direction  of  flight  after  the  turn  is  made.  The  mode  of  presentation 
of  the  items  is  as  follows : 


Item    You are    And turn    New Direction
80.     North      left
81.     West       right
82.     East       left
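Each item of this kind reduces to a quarter turn on the four-point compass. As a small illustration, the sketch below keys the three items shown above; the function name is hypothetical and the answers are derived from the item content, not taken from a printed key.

```python
COMPASS = ["North", "East", "South", "West"]  # clockwise order


def direction_after_turn(initial, turn):
    """New direction of flight after a single left or right turn."""
    step = {"right": 1, "left": -1}[turn.lower()]
    return COMPASS[(COMPASS.index(initial) + step) % 4]


items = {80: ("North", "left"), 81: ("West", "right"), 82: ("East", "left")}
for number, (initial, turn) in items.items():
    print(number, direction_after_turn(initial, turn))
# 80 West
# 81 North
# 82 North
```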

(1) Internal characteristics.—The instructions contain 2 items with
the correct answers marked and 28 recorded but unscored practice
items. There are 150 scored items in the test.

(2) Administration.—Each examinee receives a work booklet with
directions and printed items. Answers are marked directly in the work
booklet and must be transcribed to the standard IBM answer sheet when
the testing period is ended. The examinee is allowed 50 seconds to
complete the 28 sample items. Five minutes are allowed for working the
150 scored items in the test. Administration consumes approximately 2
minutes, and transcribing takes about 5 minutes, making a total testing
time of approximately 12 minutes. The examinee is told that the test is
a speed test and that his score will be simply the number of correct
responses.

»t  riyfk«hti:,l  XtHWtk  Unit  X*.  1.  Ck*»l  CMtiiWlwi:  Cftpt.  U»y4  G.
Hampkrtys  r*l.  J  inn  A.  W»ll«,  u4  Uii  G.  Wii,k

(3)  Scoring. — Rights  and  wrongs  were  scored  separately. 

Statistical results. (1) Distribution statistics.—A sample of 578
unclassified aviation students tested in October 1944 at Psychological
Research Unit No. 2 yielded a mean rights score of 95.7 and a standard
deviation of 32.8. The distribution curve is moderately negatively skewed
and somewhat flatter than normal.

(2) Internal consistency.—The degree of homogeneity of the items
of the test is indicated by a mean internal-consistency phi of 0.26, a
standard deviation of the phi distribution of 0.15, and a range of values
from -0.15 to 0.95. These statistics are based upon an analysis of the
responses of the highest 27 percent and the lowest 27 percent of a group
of 750 unclassified aviation students tested in July 1944 at Psychological
Research Unit No. 3.

(3) Difficulty.—Based upon the responses of the above-mentioned
sample of 750 unclassified aviation students, the test yielded a mean
proportion of correct responses of 0.88, corrected for chance, with a range
from 0.43 to 1.00 and a standard deviation of 0.10.

(4) Test validity.—Validation results based on a single sample are
given in table 20.6.


Table 20.6.—Validity data for Compass Orientation, CI660A, based upon a sample
July 9 to Oct. 7 190. at Psychological Research Unit No.
Assuming an unrestricted stanine standard deviation of 2.00


Evaluation.—The negative skew of the distribution and the high
proportion of correct responses indicate that the items in the test are
relatively easy, as they should be in a speed test. The validity figures for
pilots are not impressive. It is possible that a navigator or bombardier
criterion would yield better results. The test has not been factor analyzed.
Correlational data available, however, indicate that this test would
define a new factor on which Following Oral Directions and Directional
Orientation would have moderate loadings. The simplicity and ease of
administration of this test are most appealing.

Compass Directions, CP524A *

This is a test of the examinee's ability to reorient himself to a particular
ground pattern quickly and accurately when compass directions are
shifted about.

Description.—For every five test items there is a schematic circular
map which represents an aerial view of the ground, like that shown in
figure 20.6. These diagrams include such landmarks as streams, roads,
supply dumps, airports, and villages. For each item, a statement is given
that establishes arbitrary directional relationships on the map. The
examinee is then required to answer a question in terms of the rotated
compass points. Sample problem 1, for example, which is used in the
directions to explain the test to the examinees, reads as follows:

FIGURE 20.6

SAMPLE SCHEMATIC MAP OF COMPASS DIRECTIONS,

CP524A

“The forest is due north of the airport.” (See fig. 20.6.) You are flying over the
hospital. In which direction must you fly to reach the airport? (A) North, (B)
southwest, (C) southeast, (D) northeast, (E) east.

The  directions  then  continue: 

The statement that the forest is north of the airport tells you that for this problem
north is at the bottom of the map, south at the top, east at the left, and west at
the right. The airport is to the south and to the east of the hospital. The correct
answer in this case is southeast, which is choice (C).
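The reasoning in this sample problem can be expressed as measuring the course clockwise from whatever direction the item's statement defines as north. The sketch below does this with vectors. Because figure 20.6 is not reproduced here, the map coordinates are invented to agree with the explanation above (forest directly below the airport, airport up and to the left of the hospital); they and the function names are assumptions.

```python
from math import atan2, degrees

POINTS = ["north", "northeast", "east", "southeast",
          "south", "southwest", "west", "northwest"]


def vector(start, end):
    return (end[0] - start[0], end[1] - start[1])


def heading_degrees(north_vec, course_vec):
    """Angle of the course, measured clockwise from the stated north."""
    nx, ny = north_vec
    cx, cy = course_vec
    cross = nx * cy - ny * cx
    dot = nx * cx + ny * cy
    return degrees(atan2(-cross, dot)) % 360


def compass_point(angle):
    """Nearest of the eight principal compass points."""
    return POINTS[round(angle / 45) % 8]


# Hypothetical page coordinates (x to the right, y toward the top).
layout = {"airport": (0, 0), "forest": (0, -2), "hospital": (2, -2)}

# "The forest is due north of the airport" fixes north for this item.
north = vector(layout["airport"], layout["forest"])
course = vector(layout["hospital"], layout["airport"])

angle = heading_degrees(north, course)
print(round(angle), compass_point(angle))  # 135 southeast
```

The same heading in degrees is the form of answer called for by the items that ask for a compass heading.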

* Developed at Medical and Psychological Examining Unit No. 1. Chief contributors: Cpl.
Morris Friedman and Sgt. Robert Irvine.

Other problems in the test call for answers in terms of compass headings
in degrees. Thus, sample problem 2, also used in the directions,
reads:

The oil storage tanks are east of the airport. You are taking off from the airport
to bomb an enemy position in the forest. Your compass heading will be: (A) 180°,
(B) 90°, (C) 270°, (D) 45°, (E) 0°.

(1)  Internal  characteristics. — There  are  2  sample  items  (quoted 
above)  and  30  test  items,  5  to  a  diagram. 

(2) Administration.—The principles involved in the test and the two
sample problems are explained before beginning the test. The examinees
are cautioned not to turn the test booklets in order to get north into
its usual position. A testing time of 20 minutes is allowed, although in
one administration of this test, the time was 25 minutes.

(3) Scoring. — The scoring formula is R − W/4.
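
A formula of this kind is the conventional correction for guessing, in which the number of wrong answers is divided by one less than the number of choices (here five choices, A through E). The following is a minimal sketch under that assumption; the function name and the example counts are hypothetical.

```python
def formula_score(rights, wrongs, choices=5):
    """Rights minus a fraction of wrongs: R - W/(choices - 1)."""
    return rights - wrongs / (choices - 1)


# Hypothetical examinee on the 30-item test: 20 right, 8 wrong, 2 omitted.
print(formula_score(20, 8))    # 18.0

# Blind guessing on all 30 five-choice items gives, on average,
# 6 right and 24 wrong, so the expected formula score is zero.
print(formula_score(6, 24))    # 0.0
```

Omitted items neither add to nor subtract from the score, so only attempted items affect the result.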

Statistical  results.  (1)  Distribution  statistics. — Typical  examples  of 
distribution  statistics  are  given  in  table  20.7.  The  distribution  is  slightly 
negatively  skewed. 


Table 20.7.— Distribution constants for Compass Directions, CP524A

• Tested Mar. 6-8, 1944 at Medical and Psychological Examining Unit No. 7, with a
25-minute test period.

' Tested May 22 and 23, 1944 at Psychological Research Unit No. 2, with a 20-minute
test period.

(2)  Test  validity. — Validation  results  based  on  a  single  sample  are 
given  in  table  20.8. 


Table 20.8.— Validity data for Compass Directions, CP524A, based upon a sample
of 3SI navigators (Pi = 0.93; graduation-elimination criterion)*

• Tested May 22 and 23, 1944 at Psychological Research Unit No. 2.

* Assuming an unrestricted stanine standard deviation of 2.00.


Evaluation. — This  test  has  considerable  promise  as  a  satisfactory  pre¬ 
dictor  of  navigator  success.  The  validity  for  the  navigator  training  cri¬ 
terion  (0.63)  is  one  of  the  highest  test  validity  coefficients  reported  in 
the  Aviation  Psychology  Program.  The  test  is  probably  quite  reliable,  and 
the  difficulty  level  appears  to  be  satisfactory.  It  has  a  feature  not  com¬ 
mon  to  those  discussed  before  in  that  it  requires  the  examinee  to  shift 
orientation  successively  to  one  stimulus  background.  How  important  this 
is,  and  what  variance  it  introduces,  is  unknown.  The  problem  deserves 
serious study.

FIGURE 20.8. SAMPLE ITEMS OF SPATIAL ORIENTATION, CP503B


PATTERN ORIENTATION TESTS

Spatial Orientation, CP501B-CP503B *

The success of an aerial mission is frequently dependent upon the ability
of air-crew members to identify points or areas on the ground which
have been depicted to them in photographs or maps. Tests CP501B and
CP503B were designed to measure this ability. Various forms of both
these tests have been in the classification battery since August 1942.

Description. — The two tests are referred to as part I and part II
respectively, since they appear in the same booklet. For part I there is a
large aerial photograph at the top of each page, while directly below it
are six circular photographs, 1 9/16 inches in diameter, which are parts
of the larger one above. At various points in the large photographs are
letters A through H. Each of the smaller photographs has a number
below it that designates the number of the item. In performing the test,
the examinee scans the large photograph and finds the area that matches
the small photograph. The letter in the large photograph nearest the
selected area is the answer to the item. The answer is recorded on a
15-place IBM answer sheet.

In part II there is a section of a standard aviation map in color on
each page. Each map is sectioned off into twelve squares labeled A
through L consecutively. Below the map are four 3 x 2>4-inch aerial
photographs of portions of the area portrayed in the map. (The scale of
the photographs is 10 times that of the map.) The answer to an item is
the letter of the square on the map containing the photographed area.

(1) Internal characteristics. — Part I contains 1 sample problem and
49 scored items based on 9 large aerial photographs. Figure 20.7 is a
sample item of part I.

Part II contains 2 sample items and 50 scored items, based on 13
aerial maps. There are six sections in part II, each composed of a double
page. Each page contains an aerial map and four aerial photographs to
go with it. Figure 20.8 is an illustration of a sample problem in part II.

(2) Administration. — In the administration of part I, the task and
sample item are explained. Item 1 is worked, recorded, and explained
before the examinees start the test. Five minutes' testing time is allowed
for part I, and administration takes approximately 5 minutes.

In part II the task is described, the conventions of map representations
are explained, and the two sample items are explained in detail. Three
minutes’ working time is allowed for each double-page section, at the end
of which the examinee is told to turn the page and continue with the
next item. The testing time is 18 minutes, while explanations and
directions require approximately 12 minutes, bringing the total time to 30
minutes.

(3) Scoring. — R − W/5 was the usual scoring formula used, but on
occasion, for convenience, 2R − 2W/5 was used. After August 1944,
the scoring formula for parts I and II was R − W + 20. Very late in the
program it was found that validity-maximizing formulas were R − 3JW
for form CP501B, and R − 1.3W for CP503B. These formulas were
based on the statistical results yielded by a sample of 3,055 classified
pilots in class 43H. The effect of utilizing this selected sample is
unknown.

Statistical results for Spatial Orientation, CP501B. (1) Distribution
statistics. — Typical examples of distribution statistics are given in table
20.9.


Table 20.9.— Distribution constants for Spatial Orientation, CP501B

(2) Reliability coefficients. — Three samples yielded the estimates of
reliability given in table 20.10.


Table 20.10.— Estimated reliability coefficients for Spatial Orientation, CP501B,
based upon samples of unclassified aviation students

[Table body and footnotes not recoverable from the source.]


(3) Factorial composition, CP501B. — The only important loading of
this test is in the perceptual-speed factor (0.62). The next highest loading
is in the psychomotor-speed factor (0.21). The communality is 0.69,
so all the nonerror variance of the test is known. For a fuller picture of
the factorial composition of the test, see appendix B.

(4) Test validity. — Typical validation results based on several samples
are given in tables 20.11 and 20.12.

Evaluation. — With the test-retest technique, a satisfactory but moderate
reliability was found.

Standard deviation not reported.

Highly unreliable criterion.

Since this test was included in several analyses, considerable
factor-analysis data are available. The weighted averages of these data show
that 69 percent of the total variance of this test has been accounted for
by common factors. The perceptual-speed factor contributes 38 percent,
and the remaining variance is scattered over several other abilities, not
exceeding 4 percent in any one factor. The test is, therefore, a relatively
pure measure of perceptual speed, but it is outstripped for this purpose
by Speed of Identification, CP610A, by a small margin. Since the pilot
validity of this test estimated from known factor loadings is very close
to its obtained validity, and since the communality almost exactly equals
the reliability (test-retest estimate), it can be assumed that all of its
factors valid for pilot training have been accounted for.
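
The percentages quoted here follow from the loadings if the factors are taken to be uncorrelated, in which case each factor's share of the total test variance is the square of its loading. The sketch below assumes that orthogonal model and uses the two loadings and the communality quoted for CP501B; the function and variable names are illustrative only, and the text's own figures are weighted averages over several analyses, so small differences are to be expected.

```python
def variance_shares(loadings):
    """Share of total test variance attributed to each factor (loading squared)."""
    return {factor: loading ** 2 for factor, loading in loadings.items()}


# Loadings and communality quoted for CP501B in the text.
cp501b_loadings = {"perceptual speed": 0.62, "psychomotor speed": 0.21}
cp501b_communality = 0.69

shares = variance_shares(cp501b_loadings)
# Whatever part of the communality is not explained by the two quoted
# loadings must be spread over the remaining common factors.
other_factors = cp501b_communality - sum(shares.values())

for factor, share in shares.items():
    print(f"{factor}: {share:.0%}")
print(f"all other common factors: {other_factors:.0%}")
```

The perceptual-speed share comes to about 38 percent and the psychomotor-speed share to about 4 percent, in line with the figures given above.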

The validity figures for this test, based upon several criteria, are
varied. The validity for pilots in primary training is significant, but not
impressive. The validity is considerably higher for navigators. For the
unreliable bombardier criteria, validity is exceedingly low, and for radio
operator-mechanics, it is nil. Uncorrected validities for WASPs are the
highest of all (r_bis = 0.36 and 0.40), but the samples are too small to
permit a conclusive interpretation.

Statistical results for Spatial Orientation, CP503B. (1) Distribution
statistics. — Typical examples of distribution statistics are given in table
20.13. The distribution curves are positively skewed and somewhat flatter
than normal.

Table 20.13. — Distribution constants for Spatial Orientation, CP503B

(Scored R − W/5)

Group                                             N
Unclassified aviation students ¹ ............ 3,000
Do ² ........................................ (illegible)
Do ³ ........................................ 1,920
Armorers in training ⁴ ......................   376
Aviation mechanics in training ⁵ ............   232

¹ Tested in December 1942 at Psychological Research Units Nos. 1, 2, and 3.

² Tested in October 1942 at Psychological Research Unit No. 1.

³ Tested at Medical and Psychological Examining Units Nos. 4 to 10 with the November
1943 Classification Battery.

⁴ Tested with the December 194? Classification Battery. In training at Lowry Field.

⁵ Tested at Psychological Research Unit No. 2 with the December 1942 Classification Battery.

(2)  Reliability  coefficients. — Two  samples  yielded  the  estimates  of  re¬ 
liability  given  in  table  20.14. 

Table 20.14. — Estimated reliability coefficients for Spatial Orientation, CP503B,
based upon samples of unclassified aviation students

  N     Type               r_tt
 712    Test-retest ¹      0.69
 185    Do ²                .69

¹ Retest after approximately 30 days. At Medical and Psychological Examining
Unit No. 6 from April 11 to 19, 1945.

² Retest after 12 days for most of the group, although for a few it was 6 days,
and for a few others 35 days. Tested at Psychological Research Unit No. 2. Data
reported in April 194?.

(3) Factorial composition. — The most significant loading is in the
perceptual-speed factor (0.54). The next highest loading is in the
visualization factor (0.25), and no other loading exceeds 0.20. The
communality is 0.53.

(4) Test validity. — Typical validation results based on several samples
are given in tables 20.15 and 20.16.

Evaluation. — Spatial Orientation, CP503B. — As evidenced by the mean
scores, this test is fairly difficult. Its scores have a considerable degree of
dispersion, thus indicating good discrimination between examinees.
Reliability is satisfactory for a battery test.

The weighted averages of the factor loadings in this test indicate that
53 percent of its total variance has been accounted for by common factors.
Of the total variance, the perceptual-speed factor contributes 29
percent, the visualization factor 6 percent, and several other factors not
more than 2 percent each. Better tests are available for both the leading
factors.

The validity of this test for pilots in primary training and for navigators
is moderate. For bombardiers the validity is low, and for radio
operator-mechanics, it is probably zero. The obtained pilot validity (0.26)
almost coincides with that predicted from known factorial content (0.24),
so there is no need to examine the test further for unknown sources of
pilot validity.

Variations of Spatial Orientation Tests. — Previous to construction of
forms CP501B and CP503B of Spatial Orientation, forms CP501A,
CP502A, and CP503A were used in the classification battery, as three
parts of a single test booklet.

The administrative directions and test items for CP501A were constructed
in the same manner as those for CP501B. The A form is also a
“photo-photo” matching test and is composed of 42 items. Since it is the
least difficult of the three tests, it was administered first.

CP502A is a “map-map” matching test, wherein the examinee is to
recognize the area of a large aerial map which is depicted by a smaller
enlarged map. Administrative directions for this test are of the same
type as those used in the other Spatial Orientation tests. There are 24
items in the test, and being considered second in difficulty, it was
administered after CP501A.

CP503A is a “map-photo” matching test, which is made up of a small
square of aerial map in each test item and a large aerial photograph for
each six test items. The examinee's task is to pick out the area in the
photograph represented by the map area in the test item. There are 24
items in the test, and being the most difficult, it was administered last
in the series.

Table 20.15. — Validity data for Spatial Orientation, CP503B (scored R − W/5),
using graduation-elimination criterion

These three tests, as previously indicated, were administered in order
of difficulty. The fact that they are arranged in reverse order in the test
booklet, however, caused administrative difficulty. The validity, distribution,
and reliability data of these three tests compared favorably with
those of the two forms that replaced them. Since the “map-map” test is
unique in this set of tests, sample validity data might be of interest. For a
sample of 1,282 pilots in class 43A at Kelly Field, 68 percent of whom were
graduates from primary training, the uncorrected validity biserial correlation
was 0.17. The mean score of graduates was 6.41, of eliminees
5.56, and the standard deviation of both groups combined was 3.15. For
185 navigators in classes 42-10 to 42-17, Southeast Training Center, 78
percent of whom were graduated, the uncorrected validity was 0.28. The
mean score of graduates was 3.1; of eliminees 2.0; and the over-all
standard deviation was 2.3.
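
The summary figures quoted above are enough to recompute an approximate biserial correlation. The sketch below uses the textbook biserial formula, r = ((M_p − M_q) / σ) · (pq / y), where y is the ordinate of the unit normal curve at the point dividing the proportions p and q. The source does not state the computing formula actually used in the program, so this is an assumption, and the function name is illustrative.

```python
from statistics import NormalDist


def biserial_r(mean_pass, mean_fail, sd_total, p_pass):
    """Textbook biserial correlation from group means, total SD, and pass rate."""
    q_fail = 1.0 - p_pass
    unit_normal = NormalDist()               # mean 0, standard deviation 1
    cut = unit_normal.inv_cdf(p_pass)        # point dividing the two proportions
    ordinate = unit_normal.pdf(cut)          # height of the normal curve there
    return (mean_pass - mean_fail) / sd_total * (p_pass * q_fail) / ordinate


# Figures quoted in the text for the "map-map" test, CP502A.
pilots = biserial_r(6.41, 5.56, 3.15, 0.68)
navigators = biserial_r(3.1, 2.0, 2.3, 0.78)
print(f"pilots: {pilots:.3f}, navigators: {navigators:.3f}")
```

Under these assumptions the results come out at roughly 0.16 for pilots and 0.28 for navigators, close to the reported 0.17 and 0.28; rounding in the published means and standard deviations may account for the small difference.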


Aerial Landmarks, CP525A *

In this test, an attempt was made to simulate the task so frequently
performed by pilots, navigators, and bombardiers, of identifying, from
the air, landmarks previously seen in photographs taken from a different
direction and height. Observations in combat and training indicated that
air-crew members frequently fail in this task. Target identification, in
particular, was a crucial part of the bombing mission in which weakness
of this sort was brought dramatically to the supervisor's attention.

Description. — The test consists of a series of photographic presentations.
Each presentation consists of two aerial photographs on opposite
pages. On the left-hand page is given a vertical aerial view, called “the
reconnaissance photograph,” taken from approximately 10,000 feet above
the ground. Certain points on this photograph are encircled and numbered,
thus comprising the item numbers of the test. On the right-hand page is
presented the second photograph, designated the “cockpit view.” This
photograph was taken from a lower altitude, 2,000 to 5,000 feet above
the ground and from an oblique angle. Letters A through O appear at
various points on the oblique photograph. The examinee's problem is to
locate the landmarks, which were numbered in the reconnaissance
photograph, in the oblique cockpit view. When the landmark has been
identified, the letter nearest it in the cockpit view is recorded as the answer.

(1) Internal characteristics. — There are 11 photographic presentations
as described and 55 items within the test booklet. The first photographic
presentation and first five items are used as practice problems. See figure
20.9 for a sample item.

(2) Administration. — The examinees are told that their task is to
locate landmarks in a target area. The photographic presentations are
explained to them, and they are told how to select the correct answer. The
examinees are then given time to do the first five items and record their
responses. As they finish each practice item, they are told the correct
answer for that particular item. Two minutes are allotted to complete the
items in each photographic presentation. Administration requires
approximately 12 minutes, bringing the total testing time to approximately 32
minutes.

* Developed at Psychological Research Unit No. 3. Chief contributors: Capt. Stuart W.
Cook, Tech./Sgt. Paul C. Dana, and Pvt. Charles W. kriaoa.

(3) Scoring. — The scoring formula is R − W/5.

Statistical results. (1) Distribution statistics. — A sample of 500
unclassified aviation students, tested at Psychological Research Unit No. 3
in September 1944, yielded a mean score of 24.4 and a standard deviation
of 8.4.

(2) Internal consistency. — The degree of homogeneity of the items
is indicated by a mean internal-consistency phi of 0.37, a standard deviation
of the phi distribution of 0.12, and a range of values from 0.00 to
0.56. These statistics are based upon analysis of the responses of the
highest 27 percent and the lowest 27 percent of a group of 750 unclassified
aviation students tested at Psychological Research Unit No. 3 in
September 1944.
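
An item phi of this kind can be illustrated with the standard fourfold-table formula applied to the upper and lower criterion groups. The counts below are hypothetical, and the source does not describe the exact computing procedure (charts and abacs were often used for the purpose), so the sketch shows only the underlying statistic.

```python
import math


def phi_coefficient(upper_right, upper_wrong, lower_right, lower_wrong):
    """Phi for a 2 x 2 table of criterion group (upper/lower) by item result."""
    a, b, c, d = upper_right, upper_wrong, lower_right, lower_wrong
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        return 0.0  # a margin is empty, so the item cannot discriminate
    return (a * d - b * c) / denominator


# Hypothetical item: about 200 examinees in each 27-percent group of 750.
# 160 of the upper group and 90 of the lower group answer it correctly.
item_phi = phi_coefficient(160, 40, 90, 110)
print(round(item_phi, 2))
```

An item with these counts would have a phi of about 0.36, close to the mean value reported for the test.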

(3) Reliability coefficient. — For a sample of 500 unclassified aviation
students tested in September 1944 at Psychological Research Unit No. 3,
a correlation of 0.70 was found between forms A and B of this test.
Forms A and B are comparable as to content. This value may therefore
be taken as the reliability of either form.

(4) Difficulty. — Based upon item analysis of the responses of the
sample of 750 unclassified aviation students mentioned above, the test yielded
a mean proportion of correct responses of 0.52, corrected for chance,
with a range from 0.08 to 0.89 and a standard deviation of 0.24.
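
One way to read "corrected for chance" is to apply the test's scoring formula, R − W/5, to proportions instead of counts. The source does not give the correction actually used for item difficulties, so the sketch below is an assumption, and the function name and example proportions are hypothetical.

```python
def corrected_proportion(p_right, p_wrong, divisor=5):
    """Item difficulty corrected for chance, mirroring the formula R - W/divisor."""
    return p_right - p_wrong / divisor


# Hypothetical item: 60 percent answer correctly, 40 percent answer wrongly.
print(round(corrected_proportion(0.60, 0.40), 2))

# Examinees who omit the item count in neither proportion.
print(round(corrected_proportion(0.50, 0.30), 2))
```

In the first example the corrected proportion is 0.52, and in the second it is 0.44; the correction always lowers the raw proportion correct when any wrong answers are given.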

Evaluation. — This test has a fair degree of homogeneity and moderate
reliability. Its items are reasonably difficult. Owing to the lack of
adequate statistical data, no further evaluation is possible.

Variations of Aerial Landmarks. — Form CP525B is constructed in the
same manner as CP525A and contains the same number of items. It is
simply an alternate form or a continuation of form CP525A. It was
constructed for the purposes of providing a wider selection of items and
determining reliability.

Form CP525C also contains 55 items, and it is constructed and administered
in the same manner as Forms A and B. The items in Form C are
those items in A and B which possess the highest degree of homogeneity
as revealed in item analyses.

Star Identification, CP519B *

One of the prime requisites of the celestial navigator is the ability to
orient himself to significant stars and star patterns. Not only must he be
able to recognize stars and star patterns, but he must do so rapidly and
from any position. These considerations led to the construction of a test
that would measure the ability to recognize specific stars and star
formations accurately and rapidly.

* Developed at Medical and Psychological Examining Unit No. 9. Chief contributor: Capt.
Sidney M. Adams.

FIGURE 20.9. SAMPLE “RECONNAISSANCE” AND “COCKPIT” PHOTOGRAPHS OF AERIAL
LANDMARKS, CP525A

Description. — On a separate sheet, two full-page schematic sky maps
are presented. (See fig. 20.10.) For each item in the test, the examinee
is required to locate a single star which is designated by number on the
map. This number also establishes the item number. A test item in the
booklet is made up of a square, which is an enlarged portion of a sky
map, and which is divided into four parts by two intersecting lines. The
four parts are labeled A, B, C, and D. Within each part are several stars,
some of which, if connected with other stars in the square, would make
the recognizable pattern. In doing an item, the examinee first observes
the numbered star on the sky map and its position in the pattern. He
then solves the problem by finding the designated star in one of the four
sections of the square in the booklet. Figure 20.11 shows two sample
items.

FIGURE 20.10. ONE OF THE SKY MAPS (APRIL SKY MAP) OF STAR IDENTIFICATION
TEST, CP519B

FIGURE 20.11. SAMPLE ITEMS OF STAR IDENTIFICATION, CP519B

Star 1 in the April sky map is in section C of the small map. Star 2
is in section D of the small map.

(1) Internal characteristics. — Items 1 through 25 are presented in an
April sky map, while items 26 through 47 are presented in an October
sky map. The booklet also contains one practice item.

(2) Administration. — After the practice item is explained to the
examinee, he is given time to work the first two items. He is given the
correct answers for these items before starting the test. Twenty-five
minutes are allotted for the working of the test, while administration, as
such, requires approximately 8 minutes.

(3) Scoring. — The scoring formula is R − W/3.

Statistical results. (1) Distribution statistics. — A sample of 110 pilots
tested in September and October 1944 at Psychological Research Unit
No. 3 yielded a mean score (rights only) of 33.0 and a standard deviation
of 8.9.

(2) Internal consistency. — The degree of homogeneity of the items
is indicated by a mean internal-consistency phi of 0.30, a standard deviation
of the phi distribution of 0.15, and a range of values from 0.00 to
0.76. These statistics are based upon the responses of the highest 27 percent
and the lowest 27 percent of a group of 750 classified pilots tested
in October 1944 at Psychological Research Unit No. 3.

(3) Difficulty. — Based upon the responses of the same sample of 750
classified pilots, the test yielded a mean proportion of correct responses
of 0.78, corrected for chance, with a range from 0.33 to 1.00 and a
standard deviation of 0.16.

(4)  Test  validity. — Although  the  test  was  designed  for  selection  of 
navigators,  the  only  validation  data  yet  available  are  for  pilots  (see 
table  20.17). 


Table 20.17.— Validity data for Star Identification, CP519B, based upon a sample
of pilots in primary training; graduation-elimination criterion

Evaluation. — The mean internal-consistency phi of 0.30 shows a fair
degree of homogeneity, while the mean proportion of correct responses
of 0.78 indicates that the test items are much too easy. This test is
apparently not valid for pilot training, but if it does prove to have validity
unique for navigators, it would be an excellent classification test, since
it would be weighted for one assignment but not for the other. Items
with a greater degree of difficulty would improve the test. The markedly
superior validity of the error score in the small sample of pilots directs
attention to the need for further study of the separate functions of
correct and incorrect responses.

EVALUATION  AND  CONCLUSION 

With  the  exception  of  Spatial  Orientation,  the  tests  discussed  in  this 
chapter  require  extensive  additional  research  in  order  to  evaluate  them 
adequately.  Research  on  most  of  these  tests  has  thus  far  been  concerned 
primarily  with  pilot  validity.  Since  orientation  tests  as  a  class  possess 
face  validity  for  bombardiers  and  navigators,  it  would  be  important  to 
validate such tests against bombardier and navigator criteria. More
extensive factor analysis of orientation tests would supply badly needed
information. 

The Spatial Orientation tests I and II were used in the classification
battery for over 3 years and have proved to be effective instruments,
but their importance is in their measurement of the perceptual-speed
factor, and not in the measurement of a primary orientation ability.
Compass orientation tests have demonstrated substantial validities,
particularly for pilots, without, however, exhibiting any new valid
component. Results on the other orientation tests have not been carried far
enough to determine whether or not they have any unique features to
offer.


CHAPTER  TWENTY-ONE 


Tests  of  Set  and  Attention1 


INTRODUCTION 

Attention  In  Aviation 

Job  descriptions  of  air-crew  positions  stress  the  multiplicity  of  detail 
to  which  the  pilot  must  attend  in  observing  the  dials  and  indicators  and 
in  operating  the  many  switches  and  levers  in  the  cockpit  of  a  military 
airplane  in  the  proper  timing  and  sequence.  This  is  true,  although  to  a 
less  extent,  of  the  bombardier  and  navigator,  who  must  also  concentrate 
upon  their  tasks,  often  under  conditions  of  great  stress. 

Job-analysis data. — Reference to chapter 1 will show how important,
relatively, attention is regarded in aviation, either in training or in
combat. In a study of the causes of elimination of 1,000 pilots from primary
training, insufficient ability to divide attention was mentioned in 41
percent of the cases. This deficiency also existed in 43 percent of 100 cases
in a study to determine the reasons for elimination from advanced
training.

Of  112  instructors  who  were  asked  to  indicate  the  causes  of  elimina¬ 
tion  of  232  students  from  navigator  training,  68  percent  checked  “In¬ 
ability  to  concentrate  effectively  over  prolonged  periods  of  time”  (during 
examinations  and  flights). 

Supervisors of fighter and bombardment groups indicated on a 9-point
scale the minimum amount of a number of psychological traits they
believed necessary for the successful completion of the pilot’s combat
mission. Division of attention, with the mean ratings of 7.5 and 6.8
respectively, was tied for fifth rank in a list of 20 traits for fighter pilots
and was fourth from the top for bomber pilots.

Supervisors of combat bombardiers and navigators made similar ratings
for these two jobs. In both cases division of attention, with a mean
rating of 6.8, was in eighth place from the top rank.

Previous military studies of attention. — Most studies of attention during
World War I were done by Italian psychologists. A. Gemelli (2)
believed that the pilot must possess both depth of attention and ability
to attend to several stimuli simultaneously. He measured depth of
attention by exposing a small figure for a short period of time. The
candidate was given successively longer exposures until able to perceive the
figure. Records were kept of the time of exposure required for a correct
response. Candidates failed this test if the time of exposure required for
simple figures was greater than six-tenths of a second.

In order to measure breadth of attention, five small figures were exposed
simultaneously for successively longer periods until all five (different
ones in each exposure) were reported correctly. The qualifying
score was 5 figures correct after one-tenth of a second exposure. No
attempt was made to relate test results to flight performance.

F. U. Saffiotti (3) describes briefly, and not very clearly, an attention
test that is apparently of the cancellation variety. The materials
consisted of 5 series of 20 symbols each. Relevant symbols were so
dispersed that a labyrinth was formed through a mass of irrelevant
symbols. The subject crossed out the appropriate symbols in going through
the mass. Errors and time were recorded.

Dotting  tests  were  also  used  as  measures  of  continued  maximum  vol¬ 
untary  concentration  of  attention. 

Types  of  Tests  to  Be  Considered 

The  tests  to  be  discussed  in  this  chapter  were  designed  to  measure 
(1) sustained attention, (2) attention under distraction, and (3) change
of set.

TESTS  OF  SUSTAINED  ATTENTION 
Test  of  Attention,  CI659AX1  * 

This  test  is  i  rdcled  after  the  two  best  measures  of  attention  found  by 
Wittcnbom  (4)  in  a  factorial  analysis  of  a  battery  of  attention  tests.  In 
devising  items  for  these  tests,  Wittcnbom  formulated  the  following  re¬ 
quirements  : 

1. The performances should not depend too much upon intellectual level.

2. The tasks should depend to as small a degree as possible upon content
and knowledge.

3. The tests should correlate as little as possible with factors
heretofore identified.

4. The scores on the test should depend to a large degree upon a
continuous, sustained application of mental effort. The tests should be
so constructed that a layman might say they required a high degree of
concentration.

In an attempt to devise a test that would not depend too much upon a
particular kind of knowledge, material that was familiar to all persons
was chosen, namely, digits and letters of the alphabet. A variety of tests
was tried, and the kind of test that most dependably required continuous
concentration seemed to be that patterned somewhat after tests ordinarily
called following-directions tests. The items are presented one at a
time at a rate that allows little opportunity for interpolated activity. The
importance of continuous and concentrated work was enhanced by using
tasks in which the response to an ensuing item is in part conditioned by
responses to previous items.

* Developed at Psychological Research Unit No. 3. Chief contributor: Sgt. Roy C. Anderson.

Description. — In  part  I,  sets  of  three  numbers  each  are  presented 
orally  by  means  of  phonograph  records.  Two  responses  to  each  set  are 
possible,  depending  on  the  interrelationships  of  the  three  numbers.  The 
numbers  in  part  I  were  recorded  at  the  speed  of  75  beats  per  minute. 
Each number is spoken on a beat with one beat skipped between sets.
Precise  timing  was  accomplished  by  using  a  metronome,  which  was  placed 
behind  a  window  in  view  of  the  speaker,  but  out  of  sound  range  of  the 
microphone  as  the  record  was  being  cut. 
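
The pacing just described fixes the time available per item. The following is a minimal sketch of that arithmetic, assuming each set occupies three spoken beats plus the one skipped beat; the function names are illustrative and do not come from the original report.

```python
# Pacing of part I as described in the text: 75 beats per minute,
# one number per beat, one silent beat between sets.
# Assumption: a set therefore occupies 3 + 1 = 4 beats.
BEATS_PER_MINUTE = 75
BEATS_PER_SET = 3 + 1


def seconds_per_set(bpm: float = BEATS_PER_MINUTE, beats: int = BEATS_PER_SET) -> float:
    """Return the time in seconds occupied by one set of three numbers."""
    return beats * 60.0 / bpm


def series_duration(n_items: int) -> float:
    """Return the running time in seconds of n_items consecutive sets."""
    return n_items * seconds_per_set()


if __name__ == "__main__":
    print(seconds_per_set())    # seconds available per item
    print(series_duration(90))  # running time of 90 consecutive items
```

Under these assumptions an examinee has a little over three seconds per item, and 90 consecutive items run just under five minutes, not counting the spoken announcements between series.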

In part II, words are spoken that begin with any of the first five letters
of the alphabet. Words with one syllable and three letters are designated
short. Words with two syllables and five letters are designated
long. Two responses are possible to each word, depending upon the
length of the word and the relationship of the first letter to the first
letter of the preceding word.

(1) Internal characteristics. — Part I contains a sample series of 7
unrecorded and unscored items, and a practice series of 45 recorded and
unscored items presented by phonograph, followed by 90 scored items.

Part II contains a sample list of 10 unrecorded and unscored items, a
practice list of 15 recorded but unscored items, and a practice series of
45 recorded but unscored items presented by phonograph, followed by
90 scored items. Total testing time is 45 minutes for parts I and II.

(2) Administration. — Two types of five-place answer sheets are required
for Test of Attention, CI659AX1. In part I, the answer spaces
are lettered A to E. In part II, the answer spaces are numbered 1 to 5.
It was felt necessary to use the numbered instead of the lettered sheet
for part II to avoid confusion. Answers in part II depend upon the
interrelationships of the first five letters of the alphabet. If answers had
to be given in terms of letters, therefore, the task would become
spuriously more difficult.

When part I is finished, the answer sheets for that part are collected
before the blanks for part II are distributed. Following are the
directions and sample items for parts I and II. The words in italics are spoken
by the administrator and do not appear in the printed booklet of
directions.

This is a test of your ability to concentrate. Numbers will be read to you from a
phonograph record, three at a time. Your task is to listen to the numbers as they are
read and then to blacken-in either space A or space B on your answer sheet according
to the following directions which you will be given plenty of time to learn. For
each set of three numbers:

Blacken space A when:

1.  The  first  number  is  smallest,  and  the  second  is  largest — or; 

2.  The  first  number  is  largest,  and  the  third  is  smallest. 

Blacken  space  B  in  all  other  cases. 

Note the sample series at the right.

FIGURE 21.1. SAMPLE SERIES OF TEST OF ATTENTION, CI659AX1

I will now illustrate the way each series will be introduced and read. (Pause one-half
second between numbers and one second between sets.) The next item is number
1: 2-6-5, 9-7-4, 3-5-8, 9-4-7, 8-6-4, 5-7-2, 1-8-3, and so on to the end of the series.
Then the next series of 15 items will be introduced.

In item No. 1, the first condition exists; the first number is smallest and the second
largest, so the space under letter A has been blackened. (Pause) In item No. 2,
the second condition exists; the first number is largest and the third is smallest, so
the space under letter A has been blackened for item No. 2. (Pause) In items No. 3
and No. 4, neither condition exists, so B has been blackened for these two items.
(Pause) In item No. 5, the first number is the largest and the third number is the
smallest, so A has been blackened after 5. (Pause) In item No. 6 neither condition
exists, so B has been blackened after 6. (Pause) In item No. 7, the first number is
the smallest and the second number is the largest, so A has been blackened after 7.
(Pause)

You will have 2 minutes in which to memorize the above rules. Go ahead. (After
2 minutes.) Turn to page 2.
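
The two part I rules can be expressed compactly. The following is a minimal sketch, assuming the three numbers in a set are always distinct (the directions do not say what happens with ties); the function and variable names are illustrative only. It is applied to the seven sample sets read out in the directions above.

```python
def part1_response(first: int, second: int, third: int) -> str:
    """Return 'A' or 'B' for one set of three numbers under the part I rules.

    Rule 1: the first number is smallest and the second is largest.
    Rule 2: the first number is largest and the third is smallest.
    Assumption: the three numbers are distinct.
    """
    rule1 = first < third < second   # first smallest, second largest
    rule2 = third < second < first   # first largest, third smallest
    return "A" if (rule1 or rule2) else "B"


# The seven sample sets quoted in the directions.
SAMPLE_SERIES = [(2, 6, 5), (9, 7, 4), (3, 5, 8), (9, 4, 7),
                 (8, 6, 4), (5, 7, 2), (1, 8, 3)]

if __name__ == "__main__":
    for number, item in enumerate(SAMPLE_SERIES, start=1):
        print(number, item, part1_response(*item))
```

The responses produced for the sample series agree with the explanation given in the directions: A for items 1, 2, 5, and 7, and B for items 3, 4, and 6.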

After the practice series the instruction is, “Now we are ready to
begin the test. Start with item No. 61. Listen carefully.”

In  order  to  assure  that  no  one  has  lost  his  place,  the  voice  on  the  record 
tells  what  the  next  item  is  after  each  15  items.  For  example,  after  item 
75,  the  voice  says,  “The  next  item  is  No.  76.” 

At the conclusion of part I, 1 minute is allowed for a cleaning-up of
answer sheets, that is, erasing or adding any marks that were not filled
in properly during the test. Such a cleaning-up period is necessary when
answers must be recorded very quickly. The answer sheets for part I
are then collected and those for part II distributed. Following are the
directions and sample items for part II:


In part II of the test you will listen to a series of words, and mark your answer
sheet according to two new rules. Lists of simple three- and five-letter words will
be read to you from a phonograph record. There will be 15 words in a list. You must
decide whether the first letter of a word comes earlier or later in the alphabet
than the first letter of the word before it. Only words beginning with the letters A,
B, C, D, or E will be used.

Three-letter words of one syllable will be considered short words.

Five-letter words of two syllables will be considered long words.

Notice on your answer sheet that after each item there are five spaces, numbered
1, 2, 3, 4, and 5. Your task will be to listen to each word, and then to blacken-in
space 1 or 2 according to the following rules which you will be given plenty of time
to learn:

Blacken  space  1 — 

A. When a short word begins with a letter earlier in the alphabet than the first
letter of the word before it — or;

B. When a long word begins with a letter later in the alphabet than the first
letter of the word before it.

Blacken space 2 in all other cases. Be sure to blacken space 2 for the first word in
each list of 15 words.

Look at the sample list of words below.

Sample List:

ditch — Blacken 2. Blacken space 2 for the first word in a list.
bat   — Blacken 1. Short word. B of bat comes earlier in the alphabet than d of
        the preceding word ditch.
dizzy — Blacken 1. Long word. D of dizzy comes later in the alphabet than b of
        bat.
ear   — Blacken 2. Short word. Rules A and B do not apply.
apple — Blacken 2. Long word. Rules A and B do not apply.
carry — Blacken 1. Long word. C comes later than a of apple.
end   —
bag   —
catty —
allow —

During the test you will hear the list of words introduced and read to you as
follows: Begin with item number 1. (Pause one second after each word): Ditch — Bat
— Dizzy — Ear — Apple — and so on to the end of the list of 15 items. Then you will
be told to start with item 16, and a new list will begin.
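
The part II rules depend on each word's length and on the first letter of the word before it. The following is a minimal sketch, assuming that letter count alone distinguishes short (three-letter) from long (five-letter) words, as the directions imply; the names are illustrative only. The word list is the sample list from the directions, with "ear" as the fourth word, as in the spoken list.

```python
def part2_responses(words):
    """Return the space (1 or 2) to blacken for each word in one list.

    Rule A: a short word whose first letter comes earlier in the alphabet
            than the first letter of the preceding word -> 1.
    Rule B: a long word whose first letter comes later in the alphabet
            than the first letter of the preceding word -> 1.
    All other cases, including the first word of a list -> 2.
    Assumption: short means 3 letters, long means 5 letters.
    """
    answers = []
    previous = None
    for word in words:
        current = word.lower()
        if previous is None:
            answers.append(2)  # first word in a list is always 2
        else:
            is_short = len(current) == 3
            is_long = len(current) == 5
            earlier = current[0] < previous[0]
            later = current[0] > previous[0]
            if (is_short and earlier) or (is_long and later):
                answers.append(1)
            else:
                answers.append(2)
        previous = current
    return answers


SAMPLE_LIST = ["ditch", "bat", "dizzy", "ear", "apple",
               "carry", "end", "bag", "catty", "allow"]

if __name__ == "__main__":
    for word, answer in zip(SAMPLE_LIST, part2_responses(SAMPLE_LIST)):
        print(f"{word:<6} {answer}")
```

The first six answers agree with those worked out in the sample list. The last four, which the directions leave for the examinee, are only what the stated rules yield and are not given in the source.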

(3)  Scoring. — The  scoring  formula  is  R— W+100. 
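
The scoring formula can be read as rights minus wrongs plus a constant of 100. The following is a minimal sketch; it assumes that omitted items count as neither right nor wrong, which the text does not state, and the function name is illustrative only.

```python
def attention_score(rights: int, wrongs: int) -> int:
    """Score one part of the test as R - W + 100.

    Assumption: omitted items are counted in neither `rights` nor `wrongs`.
    """
    if rights < 0 or wrongs < 0:
        raise ValueError("counts must be non-negative")
    return rights - wrongs + 100


if __name__ == "__main__":
    # Hypothetical examinee on a 90-item part: 70 right, 15 wrong, 5 omitted.
    print(attention_score(70, 15))
```

With 90 scored items in a part, R - W can fall anywhere between -90 and +90, so the added constant keeps every part score positive.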

Statistical  results.  (1)  Distribution  statistics. — Typical  examples  of 
distribution statistics are given in table 21.1.

Table 21.1. — Distribution constants for Test of Attention, CI659AX1, based upon a
sample of 610 pilots*

* In class 44H. Tested at Psychological Research Unit No. 3 in February, 1944.

(2) Reliability coefficient. — A sample of 507 pilot students yielded the
estimates of reliability given in table 21.2.

Table 21.2. — Estimated reliability coefficients by the alternate-forms method for
Test of Attention, CI659AX1, based upon a sample of 507 pilots*


Score 

ru 

Part  1  . 

0.70 

.00 

0.00 

.09 

* In class 44H. Tested at Psychological Research Unit No. 3 in February, 1944.


(3) Part-score intercorrelation. — For the same sample of 507 pilots
the part I-part II correlation was 0.49.

(4) Difficulty. — Based upon item analysis of the responses of 533
pilots in class 44H, the test yielded a mean proportion of correct
responses of 0.71, corrected for chance, with a range from 0.49 to 0.95
and a standard deviation of 0.09.

(5) Intercorrelations. — This test has never been factor-analyzed. Its
correlations with some of the tests in the classification battery are shown
in table 21.3.

Table 21.3. — Correlations of Test of Attention, CI659AX1, with some classification
tests based on 282 classified pilots*

Test                                           r

Dial and Table Reading, CP621-622A ......... 0.45
Discrimination Reaction Time, CP611D .......  .43
Mathematics A, CI702E ......................  .34
Reading Comprehension, CI614H ..............  .42
Instrument Comprehension, CI616B ...........  .41

* In classes 44G and 44H. Tested in February 1944 at Psychological Research Unit No. 3.

(6)  Test  validity. — Validation  results  based  on  a  sample  of  pilots  are 
given  in  table  21.4. 


Table 21.4. — Validity data for Test of Attention, CI659AX1, based upon
graduation-elimination of 610 pilots from elementary training*

(r#=0.B9> 


Score 

“a 

SD, 

rM» 

Part  I  . 

140.7 

137.1 

27.0 

0.07 

0.12 

Part  II  . 

14X0 

14XJ 

27.0 

.01 

.00 

Total  acore  . 

204.  i 

200.4 

40.0 

.04 

.10 

* Assuming an unrestricted stanine standard deviation of 147.


(7) Item validity. — Validation of individual items of this test
disclosed the results recorded in table 21.5.


Table 21.5. — Validity of items of Test of Attention, CI659AX1, based upon
graduation-elimination of 532 pilots from elementary training*

(7, =047) 


Scare 

M0 

SDd 

Range  of  0 

U> 

High 

Part  1  . 

0021 

0452 

-0.14 

0.14 

Part  II  . 

.011 

404 

-.14 

.20 

* In classes 44H and 44I. Tested at Psychological Research Unit No. 3.

(8) Effect of seating. — Since the items are orally presented, an analysis
of the effect of placement of seats was completed. No significant
differences in scores were found among examinees in different parts of the
room.

Evaluation. — A satisfactorily reliable test of attention was modeled
after two of Wittenborn's tests of attention. Intercorrelations with tests
in the classification battery suggest verbal, numerical and space content,
although the last is difficult to rationalize. The test probably has no
unique pilot validity and has very little validity attributable to known
factors.


Following  Directions  Test,  CP402A  * 


This test was constructed early in the program in the Office of the Air
Surgeon, Headquarters, Army Air Forces, because it was felt that the
ability to carry out orders and follow directions is important to the
successful completion of air-crew training.

Description. — The  entire  test  is  printed  on  a  standard  IBM  answer 
sheet.  It  consists  of  alternate  lines  of  text  and  answer  spaces.  There  are 
also two panels of dials similar to those found in the cockpit of an airplane.
The test contains instructions for filling in the answer spaces.
These  instructions  are  constantly  altered  and  supplemented.  Figure  21.2 
illustrates  a  section  of  the  test. 


FIGURE 21.2 (section of the Following Directions Test, CP402A)

* Developed in the Office of the Air Surgeon, Headquarters, Army Air Forces. Chief
contributors: u. c*i. m. fhu ni sm*.

(1) Administration. — The introductory directions are printed on the
IBM answer sheet upside down from the rest of the test so that the
examinee will not get a preview of the test while the directions are being
read. Ten minutes are allowed for the 420 items.

(2)  Scoring. — The  scoring  formula  is  R— W. 

Statistical  results.  (1)  Distribution  statistics. — Typical  examples  of 
distribution statistics are given in table 21.6.


Table 21.6. — Distribution constants for Following Directions, CP402A


Group 

N 

M 

SO 

346 

20.7 

29.7 

12.4 

13.2 

266 

* Tested in April and May 1942 at Psychological Research Unit No. 3.


(2)  Reliability  coefficient. — For  a  sample  of  308  unclassified  aviation 
students,  tested  in  June  1942  at  Psychological  Research  Unit  No.  1,  the 
test-retest correlation (24-hour interval) was 0.75.


Table 21.7. — Validity data for Following Directions, CP402A, for pilots and
navigators (graduation-elimination criterion)


Group 

w. 

n 

mm 

M. 

SD, 

'*4. 

Pitoti  in  primary  triinioc1 

547 
2.6  SB 
1,942 

0.60 

SB* 

S.73 

1.9* 

0.0$ 

.13 

.01 

Pilot*  to  basic  training*  ... 
Pilots  in  primary  training* 

2.  SOS 

•  •  • 

SBHM 

■MM 

.11 

Navij£tora* . 

22S 

3S.9 

.26 

1*3 

.*4 

JS.2 

WMtim 

.24 

Do*  . 

367 

.90 

21.6 

■til 

Kl 

.43 

1 Using scaled scores with a mean of 6.00 and a standard deviation of 2.00. Tested at
Psychological Research Unit No. 3 in April 1942.

2 In class 4 JO. Tested at Psychological Research Unit No. 3.

3 Class not reported.

4 New aviation cadets in classes 42-10 through 42-13, Southeast Training Center. Tested at
Psychological Research Unit No. 1 in February, March, and April 194J.

5 Reclassified pilots. Classes and testing data as in footnote 4.

6 New aviation cadets and reclassified pilots tested at Psychological Research Unit No. 3
from April 1 to August 14, 1942. Class not reported.

(3) Factorial composition. — The most significant loadings were found
in the integration II (following-directions) (0.54), numerical (0.25),
verbal (0.26), and visualization (0.17) factors. The communality was
0.54. For a fuller picture of the factorial composition of this test, see
appendix B.

(4) Test validity. — Validation results based on several samples are
given in table 21.7.

Evaluation. — The Following Directions Test, CP402A, defines a factor
called integration II in the analysis of the integration battery (see
ch. 10). The test's small validity for pilots in primary training (mean
of 0.11) is partly unique, since it appears that this factor has a small
loading in the pilot criterion. The test has moderate to high navigator
validity, which cannot be fully accounted for by the factors other than
integration II. The test is worthy of further development unless a more
promising test of the factor can be found. It appears to be some kind
of a set or attention factor and may possibly be identified with
Wittenborn's attention factor.

TESTS  OF  ATTENTION  UNDER  DISTRACTION 
Attention  Test,  CP408A  4 

This test is designed to yield scores for a number of traits: ability
to resist distraction, ability to observe other data while performing a
central task, concrete vs. abstract imagery, air-crew interest, and
confidence. It is based on the assumptions that the need to perform two tasks
in a time-span too brief for either will produce distraction, and that
attention is directed toward certain objects rather than toward others as a
result of interest. It assumes that pilots must be able to divide attention,
be concrete-minded, and interested in airplanes and related objects, and
that navigators must be able to resist distractions, be abstract-minded,
be interested in scientific objects, and be confident in the results of their
own arithmetic when working under pressure.

Description. (1) Internal characteristics. — The test consists of three
units. Each contains a series of arithmetic problems, comparable to
those in the Numerical Operations Test, CI701A, intermingled with
distraction words and pictures selected on the basis of job analysis as
typical objects involved in air-crew activities and interests. The answers
given to arithmetic problems are checked as right or wrong by the
examinee. This is followed by a recognition test for the distraction
material. The examinee indicates on the following 5-point scale whether or
not the item mentioned appeared as distraction material in the numerical
operations section of the test:

If you think the item appeared and you are confident of the fact, mark under A.

If you think the item appeared but you are not confident of the fact, mark
under B.

If you think the item did not appear and are confident of the fact, mark under C.

If you think the item did not appear and are not confident of the fact, mark
under D.

Use E only in those few cases where you simply cannot mark one of the other
answers.

(2)  Figure  21.3  illustrates  a  section  of  the  test. 

(3) Administration. — The test is given in two sections. Section I
consists of numerical-operations problems, with distraction material
(pictures and words) spaced throughout on the same page. Examinees are
informed that their memory for the distracting materials will be tested.

Section II contains a memory test on the distraction material.

(4) Scoring. — The following are the formulas for the various scores:
for confidence, (R + W)/(R - W); for memory, R - W; for the interest
score for bombardiers, pilots, and navigators, R - W; for the numerical
operations score, R - W/2; for Concrete vs. Abstract Imagery,
(R - W)/(R' - W'), where R and W refer to correct and incorrect
responses for pictures (concrete material), and R' and W' refer to correct
and incorrect responses for letters (abstract material).

FIGURE 21.3. ITEMS OF ATTENTION TEST, CP408A, AND WORK-IN-FLIGHT, CE415A

4 Developed at Psychological Research Unit No. 1. Chief contributors: Lt. John K. Hemphill,
Capt. Donald E. Super.
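
The scoring formulas listed above can be written out directly. The following is a minimal sketch; the function names are illustrative, and returning None when a denominator is zero is an assumption, since the text does not say how such cases were handled.

```python
from typing import Optional


def confidence_score(r: int, w: int) -> Optional[float]:
    """Confidence: (R + W) / (R - W). Returns None when R equals W (assumption)."""
    if r == w:
        return None
    return (r + w) / (r - w)


def memory_score(r: int, w: int) -> int:
    """Memory for distraction material: R - W."""
    return r - w


def interest_score(r: int, w: int) -> int:
    """Interest score (bombardier, pilot, or navigator key): R - W."""
    return r - w


def numerical_operations_score(r: int, w: int) -> float:
    """Numerical operations: R - W/2."""
    return r - w / 2


def imagery_score(r_pic: int, w_pic: int, r_let: int, w_let: int) -> Optional[float]:
    """Concrete vs. abstract imagery: (R - W) / (R' - W').

    Unprimed counts refer to pictures, primed counts to letters.
    Returns None when the denominator is zero (assumption).
    """
    denominator = r_let - w_let
    if denominator == 0:
        return None
    return (r_pic - w_pic) / denominator


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(confidence_score(30, 10))
    print(numerical_operations_score(20, 4))
    print(imagery_score(12, 2, 6, 1))
```

The two ratio scores are undefined when rights equal wrongs in the denominator, and they change sign when wrongs exceed rights, so any real scoring procedure would have needed a convention for those cases.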

Statistical  results. — Results  are  presented  for  scores  on  confidence, 
memory,  pilot  interest,  navigator  interest,  numerical  operations,  abstract 
vs.  concrete  memory.  The  number  of  cases  in  the  bombardier  sample 
was  too  small  to  warrant  analysis. 

(1)  Reliability. — Reliabilities  of  the  various  scores  are  shown  in 
table 21.8.

Table 21.8. — Estimated reliability coefficients (odd-even method) for the various
scores of the Attention Test, CP408A, based upon a sample of 465 unclassified
aviation students1

Score                                      r_tt

Memory for distraction material ......... 0.91
Arithmetic under distraction ............  .87
Confidence in recall ....................  .97
Concrete memory .........................  .89
Abstract memory .........................  .91
Navigator interest ......................  .79
Pilot interest ..........................  .79

1 Tested in June 1942 at Psychological Research Unit No. 1.

(2) Intercorrelations of various scores. — Intercorrelations of some of
the scores of the Attention Test, CP408A, are shown in table 21.9.

Table 21.9. — Intercorrelations of certain scores of the Attention Test, CP408A1

Scores correlated                              r

Arithmetic vs. memory ...................... 0.13
Memory vs. confidence ......................  «M
Concrete memory vs. abstract memory ........  .46
Pilot interest vs. navigator interest ......  .46

1 N not reported. Probably same sample as in table 21.8.

(3) Validity. — Tables 21.10 and 21.11 present validity data for the
various scores of the Attention Test, CP408A, for pilot and navigator
samples.

Table 21.10. — Validity data for the various scores of the Attention Test, CP408A,
for a sample of 297 pilots, based upon graduation-elimination from elementary
training1


Scars 

*r 

SD, 

rM# 

SI.  70 
40.00 
14.  S# 
22.11 
21.  tJ 

SO.  SI 
29.  IS 
IS.04 
24.70 
21. tO 

17.02 
16.12 
7.09 
9.7  S 
144) 

040 

41 

-.04 

-.10 

40 

Memory  . . . 

Numerical  eperationa  . . . 

Abstr  act -concrete  memory  . 

1 Tested at Psychological Research Unit No. 1. Class not reported.

Table 21.11. — Validity data for the various scores of the Attention Test, CP408A,
for a sample of 61 navigators, based upon graduation-elimination from navigator
training1

|(7,*=04S)] 


Score 

M. 

5D, 

'..a* 

ConOdrncr  . 

M  42 

47.S4 

17.2S 

0.20 

Memory  . . . . . . 

41.71 

20.67 

19.44 

.09 

Navigator  interest . . . . . 

ii  is 

14.06 

7.02 

-.07 

Numerical  operations  . . . 

27.11 

24.66 

402 

40 

22.2S 

10.67 

12.05 

.10 

1 Tested at Psychological Research Unit No. 1. Class not reported.
2 For these data, the standard error of a zero biserial r is .30.


Evaluation. — The most promising score is the confidence score, for
navigators. While the group is very small, the biserial correlation of
0.23 indicates a possibility of a valid measure. In addition, in view of the
unusualness of the confidence score, such validity would probably prove
to be unique. The nature of the underlying variance is still unknown.
The first step would be to verify the navigator validity in a much larger
sample. The unusually low navigator validity for the numerical operations
score indicates that the sample may not be typical.

Work-In-Flight Test, CE415A *

The Work-In-Flight Test is a shorter and simpler modification of the
Attention Test, CP408A. It was designed as a test of ability to do
arithmetic problems under conditions of distraction and pressure. This ability
was thought to be important to the navigator, both as a student in test
flights and as an officer in combat and other missions. Additional
difficulty and pressure are added to the arithmetic task by (1) scattering
irrelevant distracting drawings around the page, (2) testing the examinee
for his recall of the irrelevant material and informing him that he is
to be so tested, and (3) verbal threat in which the examinee is told of
the importance of the test to air-crew work and of the poor performance
of many of those present.

* Developed at Psychological Research Unit No. 1. Chief contributors: Lt. John K. Hemphill,
Capt. Donald E. Super.

It should be pointed out that the measurement of an aviation student's
ability to stand up under pressure is rendered difficult by his status. The
very fact that he is taking classification tests is a cause for worry on the
part of the aviation student, as has been demonstrated by interviews. The
desire for a particular type of assignment (generally pilot) and the fear
of elimination are widespread. Contrasting a test given with pressure
and without pressure in this situation is therefore virtually impossible;
tests without pressure are, at best, only tests with pressure slightly
reduced.

Description. — There are two parts to the test, each consisting of
arithmetic  problems  with  distraction  material,  followed  by  a  recognition 
test  for  the  distraction  material.  Directions  emphasize  the  need  for  speed, 
accuracy,  and  memory.  A  section  of  the  test  is  illustrated  in  figure  21.3. 

Effect of pressure. — The Work-In-Flight Test, CE415A, was administered
to three samples of aviation students to study the effect of pressure.
It was given to 263 aviation students in the standard maximum
pressure manner, to 112 students as an arithmetic test only, with the
same time limits as in the standard manner, and to 114 students as a
memory test only, with the standard time limits. This made it possible
to compare (1) mean and variability of arithmetic and memory scores
under conditions of minimum and maximum pressure and (2) correlation
of the arithmetic score on this test with scores on the simple Numerical
Operations Test, CI701A, for minimized and maximized pressure.
Minimum pressure consists of doing only one of the two tasks of
the test at a time, under the surroundings of aviation-cadet classification
testing. The pictures on the test blank are, of course, still present and may
serve as a source of distraction. Maximum pressure involves the additional
pressure of two simultaneous tasks and of verbal threats.

Results. — The  effect  of  added  pressure  upon  the  means  and  standard 
deviations is shown in table 21.12.


Table 21.12. — The effect of pressure on the means and the standard deviations of
scores of the Work-in-Flight Test, CE415A, administered to 263 aviation students
in the standard manner, 112 aviation students as an arithmetic test, and 114
aviation students as a memory test1

                        Mean                      SD
Score            Minimum    Maximum       Minimum    Maximum
                 pressure   pressure      pressure   pressure

Arithmetic ....    in         15.5          5.1        5.0
Memory ........    52.1       24*           17.2       13.4

1 Tested at Psychological Research Unit No. 1, during May 1942.


The correlation coefficients between Numerical Operations, CI701A,
and the arithmetic scores of the Work-in-Flight Test, CE415A, with
maximum pressure and minimum pressure, respectively, are 0.73 and
0.70, using the samples described above. The reliability (test-retest) of
the numerical operations test is 0.81. Kuder-Richardson reliabilities of
0.73 for the arithmetic score with minimum pressure and 0.66 for the
arithmetic score with maximum pressure were found, again using the
samples described above. The correlations between Numerical Operations
and the two arithmetic scores of Work-in-Flight, corrected for attenuation
in both variables, are 0.9a and 0.97. The functions measured by the three
tests, therefore, are almost identical.
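
A correction for attenuation in both variables is conventionally computed with Spearman's formula, in which the observed correlation is divided by the square root of the product of the two reliabilities. The following is a minimal sketch under the assumption that this standard formula is the one meant; the text does not state the computation, the function name is illustrative, and the coefficients in the usage line are hypothetical.

```python
import math


def correct_for_attenuation(r_xy: float, r_xx: float, r_yy: float) -> float:
    """Spearman's correction for attenuation in both variables.

    r_xy: observed correlation between tests x and y
    r_xx: reliability of test x
    r_yy: reliability of test y
    Returns r_xy / sqrt(r_xx * r_yy).
    """
    if not (0.0 < r_xx <= 1.0 and 0.0 < r_yy <= 1.0):
        raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r_xy / math.sqrt(r_xx * r_yy)


if __name__ == "__main__":
    # Hypothetical coefficients, for illustration only.
    print(round(correct_for_attenuation(0.50, 0.64, 0.81), 3))
```

Because the divisor is at most 1, the corrected coefficient is never smaller in absolute value than the observed one, and with low reliabilities it can approach or even exceed 1.00.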

Evaluation. — It  is  clear  that  pressure,  as  here  defined,  results  in  a 
definite  lowering  of  scores  on  both  arithmetic  and  memory  sections  of 
the  test  and  in  a  lessening  of  the  spread  of  scores.  This  is  especially  true 
of the memory section. The arithmetic-computation test is, for all practical
purposes, a numerical test, with either maximum or minimum pressure.
It appears that the distraction material, while influencing scores,
does  not  introduce  new  variance  into  a  computation  test. 

Vocabulary Pressure Test, CE201A *

This test attempts to measure division of attention under pressure.
The task is that of a conventional vocabulary test. Distraction is provided
by having the student count the number of auditory signals transmitted
at irregular intervals through individual head phones.

Description.  (1)  Internal  characteristics. — The  test  uses  the  7  equated 
vocabulary  scales  of  the  ACTC  Cooperative  Vocabulary  Test,  form  R  (4), 
of  30  items  each. 

(2) Administration. — The test was administered in a code-instruction
room. While doing the vocabulary test, the students were required to
count the number of signals transmitted to them through individual head
phones. The seven scales were administered as follows: (1) unspeeded,
6 minutes; (2) moderately speeded, 4 minutes; (3) moderately speeded,
4 minutes; (4) speeded, 4 minutes, with a distraction task added (39
irregularly spaced signals were transmitted on a prearranged time schedule
controlled visually by a stop watch); (5) the same as part 4, with
28 irregular signals; (6) excessively speeded, 2 minutes, with instructions
to hurry faster given in an urgent voice every 15 seconds; (7)
the same as part 6.

(3) Scoring. — Separate scores were obtained for the following conditions:
(1) normal rate, (2) moderately speeded rate, (3) speed and
distraction, (4) excessive speed, and (5) distraction (error score).

Statistical results. — Critical ratios were computed for differences
among the means of scores. None were significant. In addition, the following
statistics are available.

(1) Distribution statistics. Typical examples of distribution statistics
are given in table 21.13.

6 Developed at Psychological Research Unit No. 1. Chief contributors: Capt. Frederick D.
Davis and Lt. Col. Laurance F. Shaffer.

Table 21.13. — Distribution constants for Vocabulary Pressure Test, CE201A, based
upon a sample of 353 pilots 1

Score                          M       SD
Normal rate                   20.6     8.4
Moderately speeded            35.5    15.1
Speed and distraction         32.5    14.[?]
Excessive speed               24.2    11.2
Distraction (error score)      4.6     4.2

1 In class 42E, Southeast Training Center. Tested at Psychological Research Unit No. 1.

(2)  Test  validity. — Validation  results  based  on  a  sample  of  pilots  are 
given  in  table  21.14. 

Table 21.14. — Validity data for Vocabulary Pressure Test, CE201A, based upon
graduation-elimination of 359 pilots in primary training 1
(p_g = 0.65)

Score                          M_g      M_e      SD_t    r_bis
Normal rate                   21.08    19.77     8.35    0.10
Speed                         36.08    34.48    15.09     .07
Speed and distraction         32.23    32.01    14.78     .03
Excessive speed               24.82    23.17    11.81     .09
Distraction (error score)      4.66     4.60     4.22     .01

1 In class 42E, Southeast Training Center. Tested at Psychological Research Unit No. 1.
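
The validity column in tables of this kind appears to be a biserial correlation between test score and the graduation-elimination criterion. Assuming that reading, and assuming that the parenthetical figure under the title of table 21.14 gives the proportion graduating as 0.65, the standard biserial formula can be sketched as follows. The function and argument names are illustrative and do not come from the source.

```python
from statistics import NormalDist


def biserial_r(mean_pass: float, mean_fail: float, sd_total: float, p_pass: float) -> float:
    """Biserial correlation between a continuous score and a pass/fail criterion."""
    if not 0.0 < p_pass < 1.0:
        raise ValueError("p_pass must lie strictly between 0 and 1")
    q_fail = 1.0 - p_pass
    z_cut = NormalDist().inv_cdf(p_pass)   # normal deviate dividing the two groups
    ordinate = NormalDist().pdf(z_cut)     # height of the normal curve at that point
    return (mean_pass - mean_fail) / sd_total * (p_pass * q_fail) / ordinate


# "Normal rate" row of table 21.14: graduates 21.08, eliminees 19.77, SD 8.35.
r_normal_rate = biserial_r(21.08, 19.77, 8.35, 0.65)
print(round(r_normal_rate, 2))
```

Under these assumptions the "Normal rate" row reproduces the tabled coefficient of 0.10. Not every row need reproduce exactly, since some digits in the reproduced table may be unreliable.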

Evaluation. — This test was unsuccessful in its attempt to set up a
stress situation. Performance under stress conditions did not differ significantly
from performance under normal conditions. None of the
scores had a significant pilot validity.

Maze Coordination Test, CM118A 7

This  test  attempts  to  obtain,  with  a  printed  test,  results  comparable  to 
those  obtained  from  psychomotor  tests  of  two-hand  coordination. 

Description. (1) Internal characteristics. — The test consists of three
sheets of 8½-inch by 10-inch paper, on each of which are mimeographed
two simple maze patterns. After a practice period with the first sheet
using each hand separately, the scored trials are made using both hands
simultaneously, tracing one maze with the right hand and the other
with the left hand. Sixty seconds are allowed for each page. Figure 21.4
shows a page of this test.

(2)  Scoring. — The  score  is  the  number  of  maze  openings  passed 
through,  allowing  a  maximum  score  of  100  per  maze. 

Statistical results. (1) Distribution statistics. — A sample of 159 pilots
yielded a mean score of 49.2 and a standard deviation of 9.5. These examinees
were tested at Psychological Research Unit No. 1 in July 1942.

(2) Reliability coefficient. — By the alternate-forms method (second
maze vs. third maze), an estimated reliability coefficient of 0.90, corrected
for length, was obtained. This figure is based on a sample of 579
pilots, tested in February 1943 at Psychological Research Unit No. 1.
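
The phrase "corrected for length" is not spelled out in the text. A minimal sketch, assuming the usual Spearman-Brown formula and assuming the correction doubles the length (the second maze correlated with the third), shows what uncorrected correlation would correspond to the reported 0.90. Both assumptions are mine, and the function names are hypothetical.

```python
def spearman_brown(r_part: float, length_factor: float = 2.0) -> float:
    """Reliability of a test lengthened by length_factor, given part reliability."""
    return length_factor * r_part / (1.0 + (length_factor - 1.0) * r_part)


def implied_part_correlation(r_full: float, length_factor: float = 2.0) -> float:
    """Invert Spearman-Brown: the part correlation implied by a corrected value."""
    return r_full / (length_factor - (length_factor - 1.0) * r_full)


# Correlation between the two mazes that would step up to the reported 0.90.
r_between_mazes = implied_part_correlation(0.90)
print(round(r_between_mazes, 3))
```

If the doubling assumption holds, the two scored mazes would have correlated at roughly 0.82 before the correction was applied.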

7 Developed at Psychological Research Unit No. 1.

FIGURE 21.4

SAMPLE ITEM OF MAZE COORDINATION, CM118A


(3) Test validity. — Validation results based on two samples are given
in table 21.15.

Table 21.15. — Validity data for Maze Coordination, CM118A, based upon
graduation-elimination from primary pilot training

N         p_g     M_g      M_e        SD_t      r_bis
156 1     0.66    50.91    46.4[?]     9.5[?]   0.29
666 2      .71    84.5     80.8       16.2       .1[?]

1 Tested at Psychological Research Unit No. 1. Class not reported.
2 In classes 42H and 42I. Tested at Psychological Research Unit No. 1.


Evaluation. — Intercorrelations are not available to test the hypothesis
that this test is comparable to a psychomotor test of two-hand coordination.
It is, nevertheless, a reliable test with moderate validity for predicting
graduation-elimination from primary pilot training.

CHANGE  OF  SET  TESTS 

A battery of change-of-set tests was constructed by the Medical and
Psychological Examining Unit No. 9, Buckley Field, Colo. It was hypothesized
that a trait of flexibility of attention contributes significantly
to a trainee's success or failure in flying training.

Each test consists of three parts: Part I serves to establish a set in the
examinee; the problems in part II are solvable either by the method developed
in part I or by a simpler method; the problems in part III are
solvable either by the methods used in parts I and II or by a still simpler
method.

Internal-consistency statistics are not presented for these tests, because,
in general, they would be meaningless, since it is the logical relationship
of a given item to its adjacent items and to the whole test which
justifies its inclusion within the test (no item per se purports to measure
the same thing which the test as a whole measures). Test-retest reliabilities
are also suspect. If the test were split into equivalent forms, or another
form of the test were administered under similar conditions, the
effect of learning (insight into the test) would bias any estimate of
reliability.

The quantitative scores derived from these tests arise out of the individual
differences in the number of problems necessary to break the
original set and to approach subsequent items without the predetermined
set. Once the examinee has learned to discard his original set, it is expected
that he will solve all subsequent problems in the most efficient
manner.

Arithmetic Problem Solving Test, CI216A 8

Description. — In this test the examinee is first presented with seven
arithmetic problems that can be solved only by the use of a comparatively
complicated procedure. These problems are designed to establish a set.
Next he is presented with a series of five similar problems that can be
solved either (1) by the comparatively complicated procedure or (2) by
a comparatively less complicated procedure. Next, a third series of five
similar problems is presented that can be solved either (1) by the comparatively
complicated procedure, (2) by the comparatively less complicated
procedure, or (3) by a very simple procedure. Of the last ten
problems, five may be solved by one of the two more complicated procedures,
and five by all three procedures. Although all the procedures
are correct, mathematically, the simpler procedures are more economical
and efficient. Thus the primary purpose is not to test for arithmetical
ability, but to determine the procedures utilized in solving problems following
the initial establishment of a mental set by means of similar
problems.

8 Developed at Medical and Psychological Examining Unit No. 9. Chief contributor: Staff/Sgt.
Milton Rokeach.

(1) Administration. — The following is quoted from the directions to
the test:

This is a test of your ability to work arithmetic problems quickly and accurately.
All of your problems on this test are based upon the following situation. You are to
imagine that you are an army doctor in Alaska. You have just received a shipment
of several thousand cubic centimeters (cc.'s) of pneumonia serum in a sealed vessel,
and you do not expect another shipment for some time. This sealed vessel has a
small faucet so that the serum may be released through the faucet, but no serum can
be poured back into the vessel. Your problem is to administer, with the smallest
amount of waste, various quantities of serum to soldiers stricken with pneumonia
* * *

Look  at  problem  2: 

      Size of containers            Serum needed

2.    17 cc.   37 cc.   6 cc.       8 cc.

      A. 31 cc.
      B. 29 cc.
      C. 37 cc.

* * * you have three empty containers and the sealed vessel. The capacity of
these three containers is 17, 37, and 6 cc.'s respectively, the problem being to get
exactly 8 cc.'s of serum from the sealed serum vessel * * * with a minimum of
waste * * * The correct solution is to fill the 37 cc. from the sealed vessel, pour
off 17 cc.'s from the 37 cc. container by filling the empty 17 cc. container, leaving
20 cc.'s in the 37 cc. container. Then pour off 6 cc.'s into the 6 cc. container leaving
14 cc.'s in the 37 cc. container. Again, pour off another 6 cc.'s into the 6 cc. container
from the 37 cc. container leaving 8 cc.'s, which is the desired amount, in the 37 cc.
container. The simplest way to indicate your solution is thus: 37 - 17 - 6 - 6.

Since you have drawn off 37 cc.'s from the sealed vessel and have used only 8 cc.'s,
29 cc.'s have been wasted * * * Your answer should have been B.

The reader will note that the instructions attempt to create a set to
solve the problems by the formula b - a - 2c, where a, b, and c refer to
the first, second, and third containers. The first five test-items can be
solved only in this way. The alternative procedures, which can be used
in later problems, are a - c and c alone.
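
The relation between method and wastage can be sketched in a few lines. The sketch reads the set-inducing formula as b - a - 2c, which agrees with the worked example in the directions (37 - 17 - 6 - 6 = 8), and takes the waste to be the amount drawn from the sealed vessel minus the amount needed. The function names, and the second item with containers of 10, 25, and 5 cc., are hypothetical illustrations and not items from the test.

```python
def solutions(a: int, b: int, c: int, needed: int) -> dict:
    """Return {method: waste} for each of the three procedures that works.

    a, b, c are the capacities of the first, second, and third containers.
    """
    # Each entry: (amount obtained by the method, amount drawn from the vessel)
    candidates = {
        "b - a - 2c": (b - a - 2 * c, b),
        "a - c": (a - c, a),
        "c": (c, c),
    }
    return {
        name: drawn - needed
        for name, (obtained, drawn) in candidates.items()
        if obtained == needed
    }


def best_method(a: int, b: int, c: int, needed: int):
    """Return the applicable method with the least waste, or None."""
    found = solutions(a, b, c, needed)
    if not found:
        return None
    return min(found, key=found.get)


# Problem 2 from the directions: only the long method works, wasting 29 cc.
print(solutions(17, 37, 6, 8))
# A hypothetical part III item on which all three methods succeed.
print(solutions(10, 25, 5, 5), best_method(10, 25, 5, 5))
```

Because each method implies a different amount of waste, the waste figure chosen by the examinee identifies the method he used, which is what allows the item to be cast in multiple-choice form.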

The  time  limit  for  the  test  is  18  minutes. 

(2) Scoring. — Since it would be impossible to machine-score this test
if the examinee were simply asked to indicate his method of solution, it
was necessary to introduce the idea of wastage into the technique. Since
the amount of serum wasted depends upon the method of solution, it was
possible to construct a multiple-choice item adaptable to machine-scoring.
The score is the number of right responses, i. e., choosing the simplest
and most economical procedure.

Statistical results. (1) Distribution statistics. — A sample of 561 unclassified
aviation students (tested at Unit No. 9) yielded a mean score
of 16.2, a standard deviation of 7.8, and a range from 0 to 25. The
distribution was approximately U-shaped. About 18 percent of the examinees
obtained scores of five and below, indicating that they never
changed their set. About 26 percent had scores of 24 or 25, indicating
that they did shift their set promptly.

There is a marked increase in the incidence of correct answers from
one item to the next, indicating that more examinees shift their set and
solve the problems by the more efficient method as they progress within
part II and part III.

Arithmetic Speed Test I and II (no code number) 9

Description. — Test I consists of seven parts of simple arithmetic
problems involving addition, subtraction, and multiplication. Part I consists
of practice problems in addition, subtraction, multiplication, and
division. Part II consists of addition problems; part III, subtraction
problems; part IV, multiplication problems; and parts V, VI, and VII
consist of mixed problems of addition, subtraction, and multiplication.

Test II is contained in parts VIII and IX. Part VIII is a series of
simple addition and subtraction problems. Part IX has the same problems,
but the examinee's task is to subtract wherever there is a plus sign
and to add wherever there is a minus sign.

(1) Administration. — Twenty-five seconds are allowed for the 45
items in each part of test I. Ninety seconds are allowed for the 90 items
in each part of test II.

(2) Scoring. — The change-of-set score for test I is the algebraic difference
between the number of problems worked correctly in parts II,
III, and IV; and in parts V, VI, and VII. The score for test II is the
algebraic difference between the number of problems worked correctly
in part VIII and part IX.

Maze Tracing Speed Test (no code number) 10

Description. — This change-of-set test consists of a series of mazes,
the examinee being required to trace the shortest and most direct path
from the starting point to the finishing point. In the first series of problems
designed to establish a set, the only pathway is a left-going and
devious pathway. In the second series of problems, there are two solutions;
the left-going, devious solution, and a clearly shorter solution. In
the third series of problems, there are three solutions; the left-going,
devious solution, the clearly shorter solution, and a very short solution.
Figure 21.5 shows a maze of the two-path type with solutions indicated.

(1) Administration. — Twelve minutes are allowed for the 29 items.

Code Deciphering Test (no code number) 11

Description. — This change-of-set test consists of three parts. In parts
I and II the examinee is taught two different simple codes. The purpose
is to establish sets for two simple codes. In part III, words of both
codes appear randomly in the same message.

9 Developed at Medical and Psychological Examining Unit No. 9. Chief contributors: Sgt.
Clarence Shepherd and Pfc. [name illegible].

10 Developed at Medical and Psychological Examining Unit No. 9.

11 Developed at Medical and Psychological Examining Unit No. 9. Chief contributors: Sgt.
Clarence Shepherd and Lt. [name illegible].

FIGURE 21.5

SAMPLE PROBLEM OF MAZE TRACING SPEED

(1) Internal characteristics. — The codes in parts I and II are extremely
simple. In part I, the last three letters of the coded word must
be placed in front of the first two letters. Thus the coded phrase "alnav
ftcra llshe ormaj dsroa" is read "naval craft shell major roads." In
part II, the last two letters must be placed in front of the first three.
Thus, "verco eepsl opesl eadah" becomes "cover sleep slope ahead." In
part III, the words in a message must be solved by both codes. Thus, in
the code "kstan inbeg ongal perup gerid" the 1st, 2d, and 5th words are
decoded by the method of part I, and the 3rd and 4th words by the
method of part II.
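
The two decoding rules are simple enough to state as code. The sketch below applies them to five-letter words; the coded strings are my reading of the printed examples, reconstructed so that each rule yields the stated plain text, and the function names are hypothetical.

```python
def decode_part1(word: str) -> str:
    """Part I rule: move the last three letters in front of the first two."""
    return word[-3:] + word[:2]


def decode_part2(word: str) -> str:
    """Part II rule: move the last two letters in front of the first three."""
    return word[-2:] + word[:3]


def decode_message(message: str, rules: list) -> str:
    """Decode each five-letter word with the rule given for its position."""
    words = message.split()
    if len(words) != len(rules):
        raise ValueError("one rule is needed per word")
    return " ".join(rule(word) for rule, word in zip(rules, words))


part1 = decode_message("alnav ftcra llshe ormaj dsroa", [decode_part1] * 5)
part3 = decode_message(
    "kstan inbeg ongal perup gerid",
    [decode_part1, decode_part1, decode_part2, decode_part2, decode_part1],
)
print(part1)
print(part3)
```

In part III the examinee is not told which rule applies to which word, so the change of set lies in recognizing, word by word, which of the two learned rules produces a meaningful result.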

(2) Administration. — One and one-half minutes are allowed for each
part.

(3)  Scoring. — The  change-of-set  score  is  the  difference  between  the 
number  of  words  correctly  solved  in  part  III  versus  parts  I  and  II. 

Reversed Clock Test (no code number) 12

Description. — This test consists of 5 parts, each part having 20 items
in the form of replicas of clock faces with different times indicated on
each clock face. In parts I and II, the subject reads the time in a normal
manner. In part III the hands of the clock are reversed; i. e., the longer
hand represents the hours, and the shorter hand represents the minutes.
In part IV the figures on the clock face are placed in counter-clockwise
rotation, and the hands move in a counter-clockwise direction, the short
hand indicating the hour and the long hand indicating the minutes. In
part V the numbers of the clock are in counter-clockwise order, and the
hands move in a counter-clockwise direction; also, the longer hand becomes
the hour hand, and the shorter hand becomes the minute hand. In
each part of the test, the examinee has a specified length of time to
finish. Figure 21.6 illustrates an item of this test.

12 Developed at Medical and Psychological Examining Unit No. 9. Chief contributors:
[names illegible].


FIGURE 21.6

SAMPLE ITEMS OF REVERSED CLOCK, AND CLOCK READING, CP527A


(1) Administration. — One minute and 15 seconds are allowed for each
part.

(2) Scoring. — The change-of-set score is the difference in number of
problems answered correctly using part II as a base.

Figure Similarity Test (no code number) 13

The examinee's task in this change-of-set test is to indicate the maximum
number of figures which are alike according to certain principles.

Description. — Each of the first 10 problems has 2 figures alike in
shading. This is to establish a set for similarity of shading in the 7 figures
which comprise each item. Each of the following 10 problems has
3 figures alike, in that they are of the same shape, and also 2 figures
alike, in that they are of the same shading. The following 10 problems
have 4 figures alike, in that they are of the same shape and area, 3 figures
alike, in that they are of the same shape, and 2 figures alike, in that they
are of the same shading. Figure 21.7 illustrates a problem of this test. In
this simple item, two figures are similar in shape, and three figures (A, C,
and one other) are alike in shading. Since more figures are alike
with respect to shading than with respect to shape, the correct answer is
the three figures which are alike in shading.

FIGURE 21.7

SAMPLE PROBLEM OF FIGURE SIMILARITY

(1) Administration. — Ten minutes are allowed for the 30 items.

(2) Scoring. — The score is the number right.

13 Developed at Medical and Psychological Examining Unit No. 9. Chief contributors: Sgt.
James G. Madden, Tech./Sgt. [name illegible], and [name illegible].

Minute Differences Discrimination Test (no code number) 14

The examinee's task in this change-of-set test is to indicate which
one, if any, of four figures is larger than a key figure.

Description. — In the first series of 10 problems designed to establish
a set, all of the figures in a given position and with a given shading are
larger than the key figure. In the next series of eight problems, none
of the four figures is larger than the key figure. Then follows another
series of four problems in which a set is established for another position.
This, also, is followed by a series of eight problems in which none of
the figures is larger than the key figure. A series of six items, then,
establishes a set for a third position, followed by a series of eight problems
in which none of the figures is larger than the key figure. Figure
21.8 shows an item of this test. The score is the number right.

FIGURE 21.8

Intercorrelations of change-of-set tests. — The change-of-set battery
was administered to 333 unclassified aviation students. One crucial check
upon the hypothesis underlying all these tests is found in the intercorrelations
among them. If there is a common underlying trait measured by
scores on these tests, there should be coefficients of substantial size in the
correlation matrix, provided each score is reasonably reliable. The intercorrelations
are presented in table 21.16.


Table 21.16. — Intercorrelations of the change-of-set battery based upon a sample
of 333 unclassified aviation students

Test                               1      2      3      4      5      6      7      8
1 Arithmetic problem solving      ...   0.11   0.10   0.06  -0.02   0.07   0.14   0.11
2 Figure similarity              0.11    ...    .07    .06    .08    .16    .04    .14
3 Maze tracing                    .10    .07    ...    .00   -.01    .17   -.02    .02
4 Arithmetic speed I              .04    .06    .00    ...    .04    .02    .04   -.01
5 Arithmetic speed II            -.02    .08   -.01    .04    ...    .12    .01   -.05
6 Reversed clock                  .07    .16    .17    .02    .12    ...   -.01    .06
7 Code deciphering                .14    .04   -.02    .04    .01   -.01    ...    .06
8 Minute difference               .11    .14    .02   -.01   -.05    .06    .06    ...

It may be seen from the table of intercorrelations that there is no
single general change-of-set factor common to these tests. The low intercorrelations
may possibly be attributed to low test reliability, but a more
plausible hypothesis is that they are due in large part to specificity of
function.
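
A quick summary of the matrix makes the point concrete. The sketch below enters the 28 coefficients above the diagonal of table 21.16 as I read them from the printed table and reports their mean and the largest single value. The variable names are illustrative, and one entry is uncertain: the table shows 0.06 in row 1 but .04 in row 4 for the same pair of tests, and the row 1 value is used here.

```python
from statistics import mean

# Upper triangle of table 21.16, keyed by row number (tests 1 through 7).
# Each list holds the correlations of that test with the higher-numbered tests.
upper_triangle = {
    1: [0.11, 0.10, 0.06, -0.02, 0.07, 0.14, 0.11],
    2: [0.07, 0.06, 0.08, 0.16, 0.04, 0.14],
    3: [0.00, -0.01, 0.17, -0.02, 0.02],
    4: [0.04, 0.02, 0.04, -0.01],
    5: [0.12, 0.01, -0.05],
    6: [-0.01, 0.06],
    7: [0.06],
}

coefficients = [r for row in upper_triangle.values() for r in row]
mean_r = mean(coefficients)
largest_r = max(coefficients, key=abs)

print(len(coefficients), round(mean_r, 3), largest_r)
```

On this reading the average intercorrelation is close to 0.06 and no coefficient exceeds 0.17, which is consistent with the conclusion that no general change-of-set factor runs through the battery.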

Clock Reading Test, CP527A 15

This test was designed for the purpose of measuring the function or
functions involved in responding to a constantly altered or rearranged,
but essentially familiar, visual stimulus-pattern. Pilots and bombardiers
are often confronted by unexpected changes in stimulus-patterns, which,
all-in-all, are familiar to them. It is anticipated, inasmuch as ready adjustment
to changes in essentially familiar situations results in an immediate
and natural lessening of tension, that the inability to make such
adjustments might well be prognostic of susceptibility to combat fatigue
and, to a lesser extent, of predisposition to combat neurosis.

Description. (1) Internal characteristics. — The test has two parts.
Part I consists of 24 diagrams of a conventional clock face. Part II consists
of 96 diagrams of clock faces presenting 16 different variations.
The variations are introduced by counter-clockwise arrangement of the
hour-numbers, by rotation of the hour-numbers 90°, 180°, or 270°, and
by reversing the meaning of the hands. The examinee's task is to tell
time under these varying conditions. The test, therefore, is very similar
to the reversed clock test, discussed above. Sample items are illustrated
in figure 21.6.

(2) Administration. — The time limits are 2 minutes and 15 seconds
for part I, and 12½ minutes for part II.

15 Developed at Psychological Research Unit No. 1. Chief contributor: Lt. V. E. Fisher.

(3) Scoring. — The scoring formula is R - W/2.
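
A minimal sketch of this scoring formula follows, taking R as the number of items answered correctly and W as the number answered incorrectly, so that omitted items neither add to nor subtract from the score. The function name and the example counts are hypothetical.

```python
def formula_score(right: int, wrong: int, penalty: float = 0.5) -> float:
    """Rights minus a fraction of wrongs; omitted items are not counted."""
    if right < 0 or wrong < 0:
        raise ValueError("counts must be non-negative")
    return right - penalty * wrong


# Hypothetical examinee on the 96-item part II: 60 right, 20 wrong, 16 omitted.
print(formula_score(60, 20))
```

With the default penalty of one-half, every two wrong answers cancel one right answer, which discourages rapid guessing on a speeded test.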

Statistical  results.  (1)  Distribution  statistics. — Typical  examples  of 
the  distribution  statistics  obtained  on  this  test  are  given  in  table  21.17. 


Table 21.17. — Distribution constants for the Clock Reading Test, CP527A, based
upon a sample of 784 pilots

Score        M       SD
Part I      15.0     4.1
Part II     45.6    17.1

(2) Inter-part correlation. — The correlation of part I with part II gives
a product-moment correlation coefficient of 0.39. In view of the difference
in content of the two parts, this cannot be considered as an estimate of
its reliability.

(3) Test validity. — Validation results based on a sample of pilots are
given in table 21.18.


Table 21.18. — Validity data for the Clock Reading Test, CP527A, based upon
graduation-elimination of a sample of 784 pilots 1

Score        M_g        M_e      SD_t     r_bis    r_bis corrected 2
Part I      15.0[?]    14.52     4.07     0.01      .15
Part II     46.6[?]    42.01    17.11      .15      .[?]

1 Tested in June 1944 at Psychological Research Unit No. 1.
2 Assuming an unrestricted stanine standard deviation of 2.00.


Evaluation. — Combat criteria are not available to check the hypothesis
upon which this test is based. It does, however, have low to moderate
validity for the pilot-training criterion. This validity may well be due in
part to visualization content, but whether any new valid factor is involved,
we do not know. It is most interesting to note the possibility that
a test of the simple ability to read clock-faces (part I) may be valid for
pilots. This may well be due to perceptual-speed variance in the test.


EVALUATION OF TESTS OF SET AND ATTENTION

The area of attention and set was explored by means of printed tests.
The studies, however, were few in number, and no concentrated effort
was made in this area. Attention tests were found to have low validity
for predicting success in pilot training. On the other hand, they seem to
involve intellectual tasks associated with success in navigator training.
The hypothesis that change of set is a fundamental trait that can be
measured by means of a battery of tests was not proved to be justified
by the results achieved.

The Following Directions Test, CI402A, the only test in the group to
be submitted to factor analysis, defined a factor whose identity is not
established. Present opinion is inclined to identify it as an integration
factor, but this name is offered only as a temporary expedient.
CHAPTER TWENTY-TWO


Introduction to Temperament Tests 1


GENERAL CONCEPTIONS

Having accounted for tests in the intellectual and perceptual areas in
the two preceding sections of this volume, and having eliminated tests
of motor functions from consideration, we have remaining an area of
psychological measurement variously known as emotion, temperament, or
personality. It is to a general orientation in this field that the present
chapter is devoted. Owing to the comparatively amorphous character of
this field, and to the dangers of confused thinking, a somewhat greater
space is given here to a general rationale. There will also be a discussion
relating temperament to air-crew requirements and of types of tests designed
to help meet the demands for measurement.

Definition  of  Terms 

There will be no attempt to develop and to defend a complete system
of psychology of personality. In order that the organization and content
of what is to follow may be clearly understood, however, it is necessary
to set forth, within the space of a few pages, the outline of a conception
of personality. This can be accomplished by means of a few definitions
and categorical statements which need not be fully accepted by the reader,
but which should serve as a frame of reference for common understanding.
The decision to be brief is at the risk of seeming superficial, but the
choice is forced upon us. It must be kept in mind in what follows that
we are merely concerned with a rational basis for quantitative descriptions
of persons. There are other ways of dealing with persons, calling
for other types of rationale. Personality is an ambiguous phenomenon,
approachable from different points of view.

It should be added that the point of view expressed here is not by any
means one adopted by the AAF Aviation Psychology Program. It is
believed to be adequate as a framework for presentation of developments
arising from diverse theoretical backgrounds. It can be defended on the
grounds of convenience for those whose interests are in the development
of temperament tests.

Personality. — People are persons because each one is unique. An individual
is unique because his pattern of traits or characteristics is different
from those of all other individuals. The term "pattern" is used deliberately
in preference to the alternative of "collection" or any other
term implying simple aggregation. An individual's personality, then, is
his unique pattern of traits.

1 Written by the editor.

Traits. — The key to personality, thus conceived, is the general phenomenon
of individual differences. When we ask how one individual is
different from another, we must get down to particulars. Comparison of
one person-as-a-whole with another person-as-a-whole is futile and
meaningless. Human comprehension fails to grasp such totalities. Observations
and conceptions require abstractions, since human minds are
limited in their spans of apprehension. It is not only a convenience but
also a necessity for us to compare individuals in one aspect at a time.
The process of abstraction, which is required, entails the observation
and conception of individuals in a manner that admittedly can be called
piecemeal. Analysis is essential whenever observations are made. Two
individuals, A and B, are different with respect to aspects c, d, e, f,
..., q; individuals B and C differ in aspects e, f, g, ..., t; and so
on. Any distinguishable way in which one person differs from another
is a trait. The term "trait" has thus a very general extension. Trait names
are legion and they vastly enrich our language describing people. Traits
may be in the nature of common-sense qualities or they may acquire the
dignity of scientific sanction and use.

Traits differ in many ways: in scalability; in universality; in generality
(v. specificity); in consistency or reliability; in flexibility (v. fixity);
in polarity; and in independence (v. dependence).

Some traits are scalable, i. e., each capable of representation by means
of a straight line, and some are not. Some traits are either present or
absent, e. g., complexes. Others are present to different degrees in
different persons. The latter are scalable. It is in this type
that the psychometrist is primarily interested.

The  universality  of  a  trait  refers  to  its  extensiveness  of  manifesta¬ 
tion  in  a  population  of  individuals.  Many  traits  are  of  such  common 
occurrence,  or  are  held  in  common  by  so  many  individuals,  that  most 
people  can  be  ranked  on  a  scale  of  more  or  less  of  those  qualities.  An 
example  is  the  degree  of  total  motor  activity  habitually  shown.  By  means 
of an appropriate instrument, the total amount of muscular energy
expended per pound of body weight during sampled periods of time might
be  the  objective  and  accurate  means  of  placing  individuals  on  this  scale. 
A  trait  of  somewhat  less  common  extent  in  a  population  would  be  that  of 
marital  adjustment  (to  assume  a  very  abstract  variable)  which  could 
apply  only  to  those  who  had  opportunity  to  exhibit  behavior  describable 
as  good  or  poor  marital  adjustment.  A  still  more  highly  restricted  trait 
would be addiction to tics, since only a small proportion of the
population would presumably have tics at all.

The generality of a trait refers to its extensiveness within the
individual. Some traits are so general that they pervade almost all the
actions of the person and are apparent in almost any kind of action. Such
traits are nervousness, meticulousness, and tempo of response. Some examples
of more restricted generality are honesty, punctuality, and endurance.
These are restricted traits because they appear in special kinds of
situations only, and because they represent deviations from special standards
of behavior in limited areas. Even more specific traits are platform
shyness, liking for beer, and dread of cats or of some particular cat.

By  consistency  is  meant  the  uniformity  with  which  the  same  trait  (or 
trait-scale  position)  is  exhibited  in  repeated  (or  similar)  situations. 
When  an  individual  is  assigned  a  particular  scale  position  on  any  trait 
variable,  we  can  only  mean  that  this  is  his  characteristic  scale  location  (in 
other  words,  his  mode,  median,  or  mean  position).  His  fluctuation  about 
this point may be great; that of another person may be small. The latter
individual  is,  of  course,  more  predictable  on  this  trait.  Some  individuals 
are  perennially  cheerful,  others  are  constantly  depressed,  and  others  run 
the  gamut  of  the  cheerfulness-depression  scale  from  time  to  time.  In 
some traits the population of individuals can be reliably ranked because
few shifts in rank occur from time to time, and in other traits the
fluidity  of  rank  positions  is  such  that  reliable  measurements  are  almost 
hopeless  and  predictions  are  practically  futile.  Distinction  must  be  made 
here  between  actual  shifts  of  position  and  changes  in  manifestation  of 
the  trait.  Many  alterations  in  an  individual’s  traits  are  more  apparent 
than  real. 

The  flexibility  of  a  trait  refers  to  its  being  subject  to  modifications  by 
learning; in other words, its docility or trainability. This might be
regarded as merely one condition for unreliability, but it is a socially
important one because it is identifiable and because training is largely
controllable. It has no unusual implications for measurement. Its effect is
similar  to  that  of  any  other  constant  error. 

The  polarity  of  a  trait  refers  to  whether  it  is  unipolar  or  bipolar.  Most 
ability traits extend logically from a zero point (complete lack of the
ability)  to  the  maximum  amount.  Most  temperament  traits  are  bipolar, 
each  extreme  of  the  scale  being  given  a  name  of  some  quality  and  the 
two  qualities  being  opposites.  A  bipolar  scale  extends  through  an  in¬ 
difference  point  or  zero  point  near  the  middle  of  the  scale. 

The independence of traits is a quite important matter in the
description of individuals. Traits, as commonly abstracted and named, exhibit
various  degrees  of  interrelationship,  as  noted  by  direct  observation  and 
more  clearly  by  means  of  intercorrelation  procedures.  In  connection  with 
the  goal  of  trait  measurement,  it  becomes  very  desirable,  from  the  stand¬ 
point  of  economy  and  rationality,  to  discover  what  the  independent  or 
near  independent  variables  of  personality  are  and  to  measure  them  sep¬ 
arately.  In  this  manner,  and  only  so,  can  maximum,  economical,  and 
meaningful  coverage  be  assured.  Having  made  piecemeal,  but  accurate, 
evaluations, we find a knowledge of the interrelationships useful in
reconstructing the totality which is the individual. This brings us to an
examination  of  the  structure  of  personality. 

The  Structure  of  Personality 

In this chapter personality has been defined so broadly as to
encompass all individual differences. Personality would therefore include
morphological and physiological as well as psychological traits. The
morphological and physiological traits are of no concern to us here, though they
contribute  definitely  to  making  individuals  unique  in  appearance  and 
health. 

The psychological traits are of various kinds. One large group, mental
abilities, may be subdivided, as they have been in this volume, into
intellectual and perceptual categories. The division line is admittedly not
completely clear. All other psychological traits may be arbitrarily
included under the heading of temperament. Tradition would designate
this area alone as personality, using that term in the more restricted
sense. It is believed that the break with tradition on this point is in the
interests of greater logical clarity. Temperament represents a somewhat
amorphous group of traits having at least the one element of
emotionality in common. It is becoming recognized that a very important aspect
of temperament is that of motivation. Motivation traits include interests
and attitudes, areas that have gained attention in vocational and social
psychology in recent years, and have yielded to attempts to quantify
them. There are probably many who would regard interests and attitudes
as coordinate with temperament. The decision on this question is
arbitrary and will not be argued here. The desire for a single word other than
"personality" to encompass the nonability traits was the deciding factor.
A better term than "temperament" is needed.

Within the individual, the structure of personality deserves mention
from a somewhat different point of view, namely, the interrelationship of
traits. For one thing, intellectual and temperamental traits do not exist
separately  in  the  behavior  of  the  individual.  Almost  any  active  behavior 
has  both  its  intellectual  and  temperamental  aspects. 

For another thing, some traits seem to be organized in systems or
hierarchies. The studies of the areas of reasoning and memory abilities, as
related in previous chapters, seem to show that there are widely
generalized factors which show themselves in varieties of tests that have
fundamental operations in common, and there are also factors of less generality
showing themselves in restricted types of tasks within the same general
area. A similar situation probably prevails to an even greater extent for
temperament traits. Individuals may differ in general proneness to fear
(general timidity). Somewhat independently they may differ in general
fear of people (general shyness) but not of other objects; in fear of
audiences (platform shyness); and in fear of specific persons (for
example, conditioned reaction to a dentist). The reason for this is that
temperamental traits arise to a large extent through habit formation or
conditioning, and they persist according to the laws of conditioning. The
laws  of  generalization  and  of  discrimination  determine  the  generality  or 
specificity  of  the  individual's  habits  and  hence  of  the  qualities  resulting 
from  them.  The  implication  from  this  is  that  any  attempt  to  predict 
behavior  of  a  certain  quality  should  take  into  consideration  the  generality 
of  the  trait  in  question.  Unless  the  generality  of  a  trait  extends  to  a  test 
situation,  it  goes  without  saying  that  the  trait  is  not  susceptible  of  meas¬ 
urement  by  tests.  Nor  are  we  interested  in  traits  that  do  not  show  suffi¬ 
cient  universality  in  the  population  concerned  when  selection  is  to  be 
made  from  that  population. 

TEMPERAMENT  AND  AIR  CREW  REQUIREMENTS 

In  selecting  candidates  with  suitable  temperament  for  air-crew  train¬ 
ing,  several  considerations  were  kept  in  mind.  The  selectees  must  be  men 
who  would  pass  all  the  hurdles  of  training,  who  would  successfully  carry 
the  fight  to  the  enemy  in  combat,  and  who  would  not  break  mentally  as 
a  consequence  of  combat.  They  must  be  highly  motivated  not  only  to 
become  skilled  performers  in  aviation,  but  also  to  engage  in  the  duties 
of  soldiers  at  the  front. 

Temperament  Traits  in  Training 

Job-analysis  findings  indicate  that  the  chief  aspects  of  temperament 
that  might  well  be  measured  are  tenseness,  nervousness,  emotional  con¬ 
trol,  absence  of  confusion,  self-confidence,  fear  and  apprehension,  inter¬ 
est  or  motivation,  leadership,  and  dependability.  Some  of  these  are  rela¬ 
tively  more  important  in  training  and  others  in  combat.  Without  any 
doubt,  emotional  traits  in  general  are  much  more  important  in  combat 
than  in  training.  This  is  partly  due  to  the  fact  that  combat  officers  face 
more  situations  that  place  a  premium  upon  desirable  temperament  traits. 
It  is  probably  due  to  some  extent  to  the  fact  that  the  classification  tests 
have  never  adequately  screened  out  men  with  weak  temperament  traits 
in any way comparable to their screening on aptitudes. The relative
dispersions of aptitude and temperament in combat air-crew personnel are
therefore  quite  different,  that  for  temperament  being  relatively  greater 
and  therefore  more  noticeable. 
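
The dispersion argument above can be illustrated with a minimal sketch. It
assumes, purely for illustration, that aptitude is normally distributed
with unit standard deviation among applicants and that classification
keeps only the men above a cutoff. The function name `truncated_sd` and the
cutoff values are hypothetical and are not figures from the program. A
trait that is not screened keeps its full spread; the screened trait does
not.

```python
import math


def truncated_sd(cutoff):
    """SD of a standard normal variable after keeping only cases above `cutoff`.

    Uses the standard result for a normal distribution truncated from below:
    variance = 1 + c * lam - lam**2, where lam = pdf(c) / P(X > c).
    """
    pdf = math.exp(-0.5 * cutoff ** 2) / math.sqrt(2 * math.pi)
    upper_tail = 0.5 * math.erfc(cutoff / math.sqrt(2))
    lam = pdf / upper_tail
    variance = 1 + cutoff * lam - lam ** 2
    return math.sqrt(variance)


# Hypothetical cutoffs, in applicant standard-deviation units.
for cutoff in (-1.0, 0.0, 1.0):
    print(cutoff, round(truncated_sd(cutoff), 3))
```

Under these assumptions a cutoff at the applicant mean leaves aptitude with
roughly 0.6 of its original standard deviation, whereas an unscreened
temperament trait would retain its full standard deviation of 1.0, so that
differences in temperament would be relatively more noticeable among the
men who remain.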

Reference  to  chapter  1  will  show  that,  in  training,  tenseness  or  nerv¬ 
ousness  ranked  high  for  bombardiers  (sixth  place  among  16  traits), 
but  only  moderately  for  the  navigator  (twelfth  place  among  29),  and  for 
the  pilot  (tenth  among  20).  Motivation  ranked  somewhat  lower  for  the 
bombardier  (eleventh  among  16)  and  very  low  in  the  list  for  the  navi¬ 
gator  (twenty-first  among  29),  and  for  pilots  in  flying  training  (eight¬ 
eenth  in  20).  Self-confidence  ranked  moderately  for  both  bombardiers 
and  navigators  and  was  not  mentioned  among  the  traits  for  pilots. 

Fear  of  flying  seemed  to  be  a  very  minor  cause  of  elimination  from 
training. For pilots in primary training the data based upon more than
150,000 students showed that only 1.5 percent of all those entering
flying training were eliminated for this cause. This represented [illegible] percent
of all eliminees. It is possible that fear played a role in many other
eliminations and was not reported as such. For the navigator, the trait
of carefulness (or what may even be called meticulousness) seemed to
be unique, and it was given very high rank by observers.

Temperament  Traits  in  Combat 

In  combat,  emotional  control  is  rated  uniformly  high  (first  place  for 
bombardiers  and  navigators  and  second  place  for  pilots  among  the  twenty 
traits).  Motivation  is  given  high  rank  (third  from  the  top  for  fighter 
pilots  and  seventh  place  for  bomber  pilots)  for  the  pilot,  but  only  mod¬ 
erate  rank  for  the  bombardier  (twelfth  place)  and  for  the  navigator 
(thirteenth  place).  It  is  probable  that  many  a  pilot  was  enthusiastic 
about  his  training  due  to  its  civilian  potentialities,  but  was  not  properly 
motivated  to  fly  ships  in  combat. 

Dependability  ranked  second  for  both  bombardiers  and  navigators, 
but  is  in  eighth  place  for  pilots.  Leadership  is  rated  uniformly  slightly 
above the median for all three specialists. This uniformity is a little
surprising in view of the fact that the pilot is most usually placed in
positions of leadership. Perhaps the fact that all three are officers and all
must  at  times  take  command,  leads  to  the  expectation  of  that  quality  in 
all  alike. 

TEST  APPROACHES  TO  TEMPERAMENT 
Difficulties in Test Development

In  the  area  of  temperament,  test  development  was  difficult  from  sev¬ 
eral  points  of  view.  It  is  one  thing  to  recognize  important  traits  and  to 
have  estimates  of  their  relative  importance;  it  is  another  thing  to  devise 
tests  for  traits  that  are  not  often  sufficiently  generalized  as  to  appear  in 
test  situations.  Even  if  they  have  that  much  extension,  the  control  of  the 
test situation is often so difficult that the variable we desire to measure
is overwhelmed by other irrelevant variances. It is still another thing to
find suitable job criteria to serve for validation purposes. This is even
more difficult than in the case of ability tests. As was said before,
temperament traits are much more important in combat than in training.

Combat criteria are more difficult to obtain and are unsatisfactory at
best. In either training or in combat, the relative variance to be attributed
to temperament traits is probably quite small. Emotional failures are
spectacular when they do occur, but their occurrence in measurable
degree is limited (for example, the 1.5 percent of fear eliminations in
primary pilot training). When temperament tests themselves have low
variances in particular factors that are valid, extremely low validity
coefficients must be expected, and extremely large samples are needed to
demonstrate genuine validity.
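
The point about sample size can be given a rough numerical form. The
sketch below is not a procedure described in this report; it assumes the
familiar Fisher z approximation for testing a correlation against zero,
and the function name `required_sample_size`, the significance level, and
the power are all illustrative choices.

```python
import math
from statistics import NormalDist


def required_sample_size(r, alpha=0.05, power=0.80):
    """Approximate n needed to detect a true correlation r (two-sided test).

    Based on Fisher's z transformation, whose sampling standard error is
    about 1 / sqrt(n - 3).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    fisher_z = math.atanh(r)  # Fisher's z transformation of r
    return math.ceil(((z_alpha + z_power) / fisher_z) ** 2 + 3)


# Hypothetical validity coefficients, from moderate to very low.
for r in (0.30, 0.10, 0.05):
    print(r, required_sample_size(r))
```

Under these assumptions a validity of .30 can be demonstrated with fewer
than a hundred cases, while a validity of .05 calls for several thousand,
which is the order of difficulty the text has in mind.
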
Another practical difficulty was the very great attrition rate between
experimental testing at the time of classification and the completion of
a tour of combat duty. Of 10,000 students tested, very few would end up
in homogeneous groups with respect to combat criteria. Many would be
lost to air-crew training at the time of classification. Consider from this
point on only the pilots. Through advanced training many would be
eliminated. Of the graduates, some become combat fliers and some not.
Of the combat pilots, some fly fighter planes and some bomber planes. In
either group some went to the European theater and some to the Pacific.
Some completed their missions under one set of conditions and some
under another. Validation within any homogeneous combat group is
therefore well-nigh impossible if the usual standards are demanded. Add
to this the unsatisfactory status of record keeping in combat areas, and
the  dark  picture  for  combat  validation  is  complete. 
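
The successive losses described above compound multiplicatively, which a
short sketch makes plain. Every fraction below is a hypothetical
placeholder chosen only to show the arithmetic; none is a statistic from
the program, and the function name `attrition_funnel` is likewise
illustrative.

```python
def attrition_funnel(start, stages):
    """Return (stage name, expected number remaining) after each stage."""
    remaining = start
    result = []
    for name, fraction_kept in stages:
        remaining = remaining * fraction_kept
        result.append((name, round(remaining)))
    return result


# All fractions are hypothetical placeholders, not program statistics.
stages = [
    ("classified as pilot", 0.50),
    ("graduated from advanced training", 0.60),
    ("assigned to combat", 0.50),
    ("fighter rather than bomber", 0.50),
    ("one theater", 0.50),
    ("one set of mission conditions", 0.50),
]

for name, count in attrition_funnel(10_000, stages):
    print(f"{name}: {count}")
```

With these placeholder fractions fewer than 200 of the original 10,000
examinees would remain in any one homogeneous combat group, even before
allowance is made for incomplete records.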

Some  Principles  of  Test  Development 

The  types  of  tests  needed  to  meet  the  requirements  briefly  envisaged 
above  had  to  be  selected  with  several  considerations  in  mind.  Because  the 
tests  would  be  used  to  qualify  or  disqualify  students  and  to  assist  in 
establishing  their  classification,  all  possible  criticisms  as  to  their  fairness, 
safety,  and  objectivity  had  to  be  forestalled.  Controls  had  to  be  so  ade¬ 
quate  that  the  examinees  could  not  change  the  fundamental  character  of 
the  test  by  failing  to  undertake  the  task  as  the  examiner  had  intended. 
The right type of motivation and set had to be aroused in examinees.
To meet these requirements, a number of principles were generally
observed, although explorations over a wide range of possibilities forced
violations  of  those  principles  at  times. 

Consideration for the examinee. — The question of safety never arose
in  connection  with  printed  tests,  but  was  often  a  factor  in  performance 
tests — such  as  looping  chairs,  falling  hammers,  tests  of  endurance,  depri¬ 
vation  of  air,  etc.  It  was  difficult  to  frighten  or  to  tire  or  to  otherwise 
subject  the  examinee  to  a  severe  emotion-inducing  situation  without  risk 
of  injury  or  of  opening  the  psychologists  to  the  charge  of  injury  even 
when none occurred. Even "razzing" or insulting or annoying
examinees was rarely attempted because of the risk of destroying rapport in
other tests and because such practices are generally repugnant in a
democratic society. These facts are mentioned merely to show that the
program was more or less forced back upon printed tests for its main type
of  instrument  in  temperament  testing. 

In  all  types  of  temperament  tests,  it  is  usually  important  to  conceal 
from  the  examinee  the  genuine  intentions  of  the  tests.  In  attempting  to 
accomplish  this  end,  great  dependence  is  placed  upon  the  wording  of  the 
instructions and upon the disguised appearance of test content. It was
one principle of the program to avoid deliberate falsehoods in giving
instructions  to  examinees.  This  naturally  had  its  restricting  influence. 

Types of Temperament Tests

Other principles observed can well be pointed out in connection with
an account of the different types of tests developed or tried out for air crew.

Personality inventories. — From the early days of the program,
considerable distrust was felt concerning personality inventories or
questionnaires. This was chiefly on the ground that intelligent examinees can
make more favorable scores for themselves than are justified, by
falsifying their replies to questions. The extent to which this is true has to be
demonstrated for each test and each item. It is possible that some tests
of this sort have been so devised, or new ones could be so devised, that
the benefits from falsification could be reduced to a minimum. Even if
this were true, however, it was felt to be undesirable for any examinee
to leave the tests with the conviction that he had been able to outwit the
psychologist  and  to  add  to  his  chances  of  qualification  by  willfully  giving 
erroneous  answers. 

The fact that the Bernreuter Personality Inventory had previously
failed  to  predict  training  success  in  either  CAA  studies  or  in  the  Army 
Air  Corps  also  lent  a  restraining  hand.  The  apparent  success  of  the 
Shipley  Personal  Inventory  in  the  armed  services  as  the  war  pro¬ 
gressed,  however,  and  the  availability  of  the  experimental  testing  time 
were  favorable  to  the  administration  of  a  number  of  commercial  per¬ 
sonality  inventories  for  validation  against  pilot  training  so  that  extensive 
knowledge  of  the  true  status  of  usefulness  of  inventories  could  be  ob¬ 
tained.  The  studies  of  these  tests  were  planned  in  such  a  manner  that  if 
the  questionnaire  type  of  item  is  useful  at  all  in  prediction,  a  large  pool 
of valid items could be selected. Other well-known commercial tests that
were included along with the inventories were the Strong Vocational
Interest Blank and the Kuder Preference Record.

General information tests. — Other types of printed temperament tests
developed  in  the  Army  Air  Corps  were  varied  in  nature.  One  type  which 
was  capitalized  upon  extensively  is  the  information  test.  It  was  seen  in 
chapter 14 how it is possible to assess pilot interest indirectly through
what  the  individual  knows  (or  does  not  know).  The  same  device  has 
been  tried  in  connection  with  other  qualities,  such  as  masculinity  and 
leadership.  The  information  tests  have  a  bona  fide  appearance  to  the 
examinee.  The  use  of  information  tests  for  assessing  temperamental 
traits,  however,  calls  for  weighting  responses  empirically,  positively  or 
negatively,  as  they  correlate  with  achievement  criteria.  Occasionally,  ex¬ 
aminees  have  expressed  wonderment  at  the  inclusion  of  certain  items. 
(One might well query why a good pilot should know the correct answer
to the question, "An arpeggio is ....") It is desirable to keep such
items to a minimum even though they may prove to be predictive.
Another qualification to mention regarding general-information tests is that
the highly educated individual's score may be biased. One solution to
this might be to give a general-vocabulary test in conjunction with the
information test, and to use the score as a reference variable or as a
suppression variable.
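
The suppression idea can be sketched numerically. The example assumes the
standard formula for a semipartial correlation, in which the vocabulary
score is removed from the information test before the test is correlated
with the criterion. The three correlations and the function name
`semipartial_validity` are hypothetical and are not results from the
program.

```python
import math


def semipartial_validity(r_test_crit, r_vocab_crit, r_test_vocab):
    """Correlation of the criterion with the information test after the
    vocabulary score has been partialled out of the test."""
    numerator = r_test_crit - r_vocab_crit * r_test_vocab
    return numerator / math.sqrt(1 - r_test_vocab ** 2)


# Hypothetical values: the information test correlates .20 with the
# criterion and .60 with vocabulary; vocabulary alone has zero validity.
print(round(semipartial_validity(0.20, 0.00, 0.60), 3))
```

With these assumed values the validity of the information test rises from
.20 to about .25 once the irrelevant vocabulary variance is suppressed,
even though the vocabulary test predicts nothing by itself.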

Biographical  data. — Biographical  data  tests  were  at  first  somewhat 
suspect,  along  with  personality  inventories.  Results  which  became  avail¬ 
able  during  1942,  however,  drastically  altered  the  evaluations  of  this 
form  of  test.  Item  validations  yielded  scoring  keys  with  quite  substantial 
unique  validity,  and  so  a  Biographical  Data  Blank  became  a  permanent 
part  of  the  classification  battery.  Early  Biographical  Data  Blanks  em¬ 
phasized  factual  information  regarding  previous  environment,  education, 
and general experience. It was demonstrated empirically that in spite of
the  opportunity  for  a  certain  amount  of  falsification  under  the  usual  test 
conditions,  the  validity  of  the  scores  held  up.  It  is  possible  that  with 
factual  information  upon  which  the  examinee  knows  there  can  be  a 
check-up if necessary, a much higher percentage of truthful responses
can  be  obtained.  More  recently  developed  experimental  forms  of  bio¬ 
graphical  data  have  resorted  to  less  factual  questions,  most  of  which 
still  lack  validation  studies. 
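
The empirical keying of items mentioned above, in which responses are
weighted positively or negatively as they relate to a criterion, can be
sketched as follows. This is a simplified illustration and not the keying
procedure actually used: it assumes yes-or-no items, a pass-fail
criterion, unit weights, and an arbitrary threshold on the difference in
endorsement rates. The names `item_weights` and `score` and the data are
hypothetical.

```python
def item_weights(responses, passed, threshold=0.2):
    """Assign +1, -1, or 0 to each item from graduate/eliminee differences.

    responses: list of per-examinee lists of 0/1 item answers.
    passed:    list of booleans, True for graduates, False for eliminees.
    """
    grads = [r for r, p in zip(responses, passed) if p]
    elims = [r for r, p in zip(responses, passed) if not p]
    weights = []
    for j in range(len(responses[0])):
        p_grad = sum(r[j] for r in grads) / len(grads)
        p_elim = sum(r[j] for r in elims) / len(elims)
        diff = p_grad - p_elim
        if diff >= threshold:
            weights.append(1)    # answer is more common among graduates
        elif diff <= -threshold:
            weights.append(-1)   # answer is more common among eliminees
        else:
            weights.append(0)    # item does not discriminate; left unscored
    return weights


def score(answers, weights):
    """Keyed score for one examinee: sum of weights of endorsed items."""
    return sum(a * w for a, w in zip(answers, weights))


# Tiny hypothetical sample: four examinees, three items.
responses = [[1, 0, 1], [1, 0, 0], [0, 1, 1], [0, 1, 0]]
passed = [True, True, False, False]
weights = item_weights(responses, passed)
print(weights, score([1, 0, 1], weights))
```

A key derived in this way capitalizes on chance in the sample from which
it was built, so its validity would ordinarily be checked on a fresh
sample before use.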

Projective  methods. — An  attempt  was  made  to  get  at  the  deeper  re¬ 
cesses  of  temperament  by  means  of  the  projective  techniques  of  which 
the  Rorschach  and  the  Thematic  Apperception  tests  are  familiar  exam¬ 
ples.  The  individual  administration  of  these  tests  as  in  clinical  practice 
was  recognized  as  being  out  of  the  question  for  mass  testing  (some  ex¬ 
amining  units  tested  as  many  as  500  men  per  day  with  group  tests  and 
simultaneously  500  others  with  psychomotor  tests).  The  Rorschach  test 
was  given  a  thorough  and,  it  is  believed,  an  adequate  trial  in  its  approved 
clinical  form,  with  an  attempt  to  predict  success  in  pilot  training  from 
any or all of the data yielded by it. Two variations adapted for group
administration (Harrower-Erickson and an improvised form) were also
tried  out.  The  Thematic  Apperception  test  of  Murray  was  given  in  mod¬ 
ified  form  for  validation  and  a  number  of  variations  of  the  thematic  ap¬ 
proach  were  developed  in  printed  form  for  group  administration.  From 
none  of  the  projective  techniques  used  were  any  promising  results  ob¬ 
tained. 

Observational techniques. — Other approaches, hardly to be classified
as printed tests but fitting no other category in this series of volumes
any better, were observational and interview techniques. Observations
were  made  of  examinees  during  the  administration  of  psychomotor  tests 
and  under  other  situations  when  they  could  be  observed  unobtrusively, 
and  ratings  of  various  traits  were  recorded  by  observers.  Personal  inter¬ 
views  of  a  substantial  length  were  also  held,  after  which  predictions  of 
probable  success  were  recorded.  These  procedures  yielded  only  the  mini¬ 
mum  acceptable  validities,  and  all  of  them  suffered  seriously  from  the 
necessary  factors  of  subjectivity  in  administration  and  interpretation. 

Motivation  tests. — A  final  group  of  measuring  devices  was  concerned 
with interests and attitudes. There is no denying the importance of
placing a man in training which he will accept or even welcome. This
statement has two facets. On the one hand, does the job provide satisfaction
of  the  man's  fundamental  interests?  On  the  other  hand,  is  it  one  that 
he thinks he prefers? The degree of correlation between genuine
interests and self-appreciated interests is unknown. It probably differs for
various  areas  of  occupations  or  even  for  specific  occupations.  In  the 
classification  of  aviation  trainees,  the  approach  was  from  both  aspects. 
It  was  believed  to  be  important  to  obtain  from  each  candidate  his  own 
statement of preference and to follow it as well as possible in
classification, even though a man might not know his own interests well. The use
of  this  datum  in  classification  was  weighted  far  in  excess  of  anything 
justified  by  validity  against  training  criteria.  Efforts  were  also  made  by 
means  of  various  types  of  printed  tests  to  assess  fundamental  interests 
and  morale  as  factors  for  success  in  specialized  training.  A  peculiar  in¬ 
stance  of  the  problem  of  interests  was  intensively  investigated  in  connec¬ 
tion  with  the  assignment  of  pilots  to  fighter  v.  bomber  training,  for  in 
this  respect  the  two  assignments  apparently  had  requirements  that  were 
significantly  different. 

The  Plan  of  Presentation  of  Tests 

The  organization  of  the  chapters  following  is  open  to  question  in  sev¬ 
eral  respects.  More  than  one  principle  of  grouping  exists,  and  few  could 
be followed completely. The result is a compromise of two or more
principles. 

Chapter  23  begins  with  personality  inventories,  most  of  which  are 
commercial  tests.  Chapter  24  presents  clinical  techniques — projective  and 
observational  methods  and  variations  thereof.  Chapter  25  embraces  a 
miscellaneous list of tests each designed for some particular trait.
Chapter 26 is coherent by reason of its single large area of motivation,
including interests, morale, and attitudes. Chapter 27 concerns the
biographical-data approach. On the whole, reasonably equitable distribution
of  effort  and  attention  is  evident  as  far  as  one  can  justifiably  proceed  at 
the  present  time  with  printed  tests. 

The coding of temperament tests. — The grouping of tests in these
chapters does not fully agree with the code numbering of the tests. Code
numbers were assigned according to the following groups:

100 — Absence of tenseness.
200 — Absence of nervousness.
300 — Absence of fear and apprehension.
400 — Temperament.
500 — Motivation.
600 — Personal information.
700 — Projective techniques.
800 — Fatigue.

Printed tests are almost entirely confined to the 300 to 700 groups
inclusive, since tests of tenseness, confusion, and fatigue are not well
adapted to printed form.

Concluding  comment. — No  strikingly  original  form  of  temperament 
tests  was  developed  by  the  Army  Air  Forces,  though  a  number  of  inter¬ 
esting  innovations  were  tried.  Considerable  experience  was  gained  in  the 
use  of  traditional  forms  and  of  new  variations  of  the  same.  Perhaps  the 
most  profitable  finding  was  that  much  can  be  gained  by  the  adaptation 
of  general-information  tests  and  of  biographical-data  information  in  the 
prediction  of  success.  Such  internally  heterogeneous  tests  must  be  con¬ 
structed  to  fit  almost  every  type  of  job  empirically,  validating  each  item 
against  a  proficiency  criterion.  They  will  probably  be  found  ample,  thus 
developed,  in  a  great  many  spheres  of  vocational  selection  and  guidance. 
They  do  not  satisfy  the  psychometrist  who  likes  to  know  what  traits 
he  is  measuring  and  who  prefers  unique  tests. 

Personality Inventories1


INTRODUCTION 

In  this  chapter,  and  the  next,  the  results  of  the  validation  of  many 
commercially  provided  instruments  for  the  evaluation  of  personality 
will  be  considered.  Inventories  and  questionnaires  are  considered  in  this 
chapter; clinical-type instruments are considered in the next. The
distinction between these two types of instruments is thought to be one of
considerable  importance.  In  the  clinical-type  evaluation,  the  examinee 
responds  to  a  relatively  unstructured  situation,  in  which  he  is  free  to 
give  responses  as  he  secs  fit.  In  the  personal  inventories,  the  approach  is 
deliberately  controlled  as  much  as  possible,  in  order  to  force  the  examinee 
to  react  in  one  specific  manner  or  another  so  as  to  obtain  the  best  possible 
estimate  of  individual  differences  on  some  trait  and  to  minimize  differ¬ 
ences  due  to  uncontrolled  factors. 

The primary responsibility for the development of tests of emotion and
temperament was assigned to Psychological Research Unit No. 1. In addition
to developing many special tests and techniques to assess personality
characteristics which were indicated by job analyses to be important to
air-crew success, the unit undertook the administration and validation of
as many commercial tests as time and circumstances permitted. This decision
was based primarily upon two considerations: (1) It seemed desirable to
estimate the predictive value of various commercial tests, already
constructed, both for information as to the actual worth of the instrument
in predicting success in air-crew positions, and to obviate any possible
postwar criticism that use of comparatively well known instruments was not
attempted; and (2) it was hoped that by means of item-analysis procedures
(see ch. 3) a large pool of items might be built up from various tests
which could be combined into one or more highly predictive tests for use in
the selection program.

The criterion utilized almost exclusively for the validation of these
personality tests was that of graduation-elimination from primary training.
The consensus of flying instructors seemed to be that various attributes of
personality are of importance in successfully completing the various stages
of flying training. It is obvious, however, that combat criteria are at
least equally desirable, since many of the tests purport to assess
temperamental variables which should be of value in estimating

*  Written  fc/  St*S/S*C  Aitkur  Z.  Ctti. 
predisposition to combat neuroses. Such criteria, however, were impossible
to secure for the purpose.

Several patterns of evaluation have been employed. Two complete procedures
have been to (1) validate the scores provided by already existent keys and
(2) item-analyze the test and perform a cross-validation, employing random
halves of an entire sample of papers. For many tests, abbreviated
techniques have been employed. For some, item analyses of random halves of
the entire sample have been made; cross-validations were then performed
only if the analyses indicated a sufficiently large number of valid items.
This procedure also provides an indirect estimate of the validity of the
keys provided with the test, for, if the items for a given score are not
valid, then the score cannot be valid. A second abbreviated procedure has
been to validate the scores only.
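
The random-halves procedure described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the unit's actual item-analysis method: the function names, the difference-in-proportions rule for calling an item "valid," and the 0.15 cutoff are all hypothetical and were introduced here for the example.

```python
import random

def random_halves(case_ids, seed=0):
    """Shuffle the case identifiers and split them into two halves."""
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)
    mid = len(ids) // 2
    return ids[:mid], ids[mid:]

def build_key(responses, graduated, cases, min_diff=0.15):
    """Item-analyze one half: keep items whose 'yes' rate separates the groups.

    responses: dict mapping case id -> list of 0/1 answers (1 = "yes")
    graduated: dict mapping case id -> True (graduate) or False (eliminee)
    Returns a dict mapping item index -> +1 or -1 (scoring direction).
    The min_diff cutoff is an assumed, illustrative threshold.
    """
    grads = [c for c in cases if graduated[c]]
    elims = [c for c in cases if not graduated[c]]
    if not grads or not elims:
        raise ValueError("need at least one graduate and one eliminee")
    key = {}
    for i in range(len(responses[cases[0]])):
        yes_g = sum(responses[c][i] for c in grads) / len(grads)
        yes_e = sum(responses[c][i] for c in elims) / len(elims)
        if abs(yes_g - yes_e) >= min_diff:
            key[i] = 1 if yes_g > yes_e else -1
    return key

def score(responses, key, case):
    """Score one paper from the other half with the derived key."""
    return sum(weight * responses[case][i] for i, weight in key.items())
```

In use, a key would be built on one half and the resulting scores validated against the criterion on the other half, so that items are never evaluated on the same papers that selected them.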

The greater number of the tests discussed in this chapter are available in
commercial form. Three of the tests to be discussed here, however, were
developed in the AAF Psychology Program. They are the Restricted Word
Association Test CE702B, the Information Blank (S-C) CE410A, and the
Teacher Preference Scale CE426A.

The description of the tests will be in two sections; the first will be
devoted to personal inventories, the second, to preference inventories.

PERSONAL  INVENTORIES 

In this section will be included tests that sample various personality
areas. Both normal and deviate areas are explored and evaluated, such as
introversion-extroversion, economic status, familial reactivity, and
hypochondriacal tendencies. The order of presentation of the tests is
purely arbitrary.

In general, the tests were administered from commercial booklets, with
responses being recorded on IBM answer sheets, suitable for machine
scoring. Each test, though it may be commercial, has been assigned a code
number consistent with the AAF series of tests, for convenience. All were
given with time limits that permitted all examinees, or nearly all, to
respond to all items.

Information  Blank  S-C,  CE410A  * 

This test represents one of the earliest studies on the efficacy of
temperament questionnaires as predictive instruments in the AAF Psychology
Program. An important objective in preparing and administering this test
was to study the feature of falsification by students in response to
questionnaire items when the test is taken in a competitive situation. The
validation of the scores was a secondary consideration.

Description. — The questionnaire consists of 60 items, of which 25 are
presumed measures of the trait of self-sufficiency v. sociability (type S),
25 of the trait self-confidence (type C), and 10 of truthfulness (type T).
The 50 items other than truthfulness were adapted from the Bernreuter
Personality Inventory (CK133A) with the aid of Flanagan's factor analysis
(13). The 10 truthfulness items were written for the questionnaire and are
of a type similar to those used in J. N. Washburne's "Social Adjustment
Inventory." The truthful response consists of admitting that one doesn't
always behave in a particular socially approved fashion.

* Developed at Psychological Research Unit No. 1. Chief contributor: Lt.
Col. Laurance F. Shaffer.

(1) Internal characteristics. — The items are answered either "yes" or
"no." Samples of each type of item follow:

Type S: Do you like company when you are feeling sad?
        Do you make most of your decisions alone?

Type C: Do you often feel just miserable?
        Do you dislike public speaking?

Type T: Did you ever take anything, even a pin, that belonged to someone
        else?
        Did you ever tell what was not quite the truth in order to get
        yourself out of a difficulty?

(2) Administration. — The time required is approximately 15 minutes.
Pertinent directions are:

In this blank you are asked some general questions about the way you think.
It is not a test. There are no right answers except the answers that tell
the truth about yourself.

* * * Work rapidly. Don't think a long time about each question, but record
your first judgment promptly. Answer every question. Omit none.

(3) Scoring. — The scoring was accomplished on the basis of the author's
a priori key. High numerical scores for the three variables indicate
self-sufficiency, self-confidence, and truthfulness, respectively.

Statistical  results. — Data  are  available  for  a  sample  tested  in  March 
1942  at  Psychological  Research  Unit  No.  1. 

(1) Distribution statistics. — A sample of 200 unclassified aviation
students yielded the distribution data presented in table 23.1.


Table 23.1. — Distribution data for the Information Blank (S-C), CE410A,
based on a sample of 200 unclassified aviation students

Scale       M       Median     SD
S          10.7      10.2      2.9
C          18.1      18.6      3.4
T           5.4       5.5      1.8

(2) Test validity. — Two sets of validation data are available. Biserial
coefficients of correlation between test score and graduation-elimination
for the three types of training combined (bombardier, navigator, and pilot)
are presented in table 23.2. Bombardiers and navigators were not considered
separately because of the small number of eliminees, 3 and 14 respectively.
The data for the validity of this instrument in predicting success in pilot
primary training alone are presented in table 23.3.
Table 23.2. — Validation data for a group of bombardiers, navigators,1 and
pilots,2 using a graduation-elimination criterion, for the Information
Blank (S-C), CE410A 3

Scale      M_g       M_e       SD_t      r_bis
S         10.83     11.11      2.95      -0.05
C         18.66     18.21      3.52        .07
T          5.35      5.16      1.90        .05

1 Criterion for bombardiers and navigators is graduation-elimination from
advanced training.

2 Criterion for pilots is graduation-elimination from primary training.

3 N_t = 275, p_g = 0.86.


Table 23.3. — Validation data for a group of pilots in primary training,
using a graduation-elimination criterion, for the Information Blank (S-C),
CE410A 1

Scale      M_g       M_e       SD_t      r_bis
S         10.24     11.29      2.89      -0.21
C         19.01     17.95      3.62        .18
T          5.18      4.67      1.93        .16

1 N_t = 83, p_g = 0.75.
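
As a check on how the tabled quantities relate, the biserial coefficient can be recomputed from the two group means, the total standard deviation, and the proportion graduating. The sketch below uses the conventional biserial formula, r = ((M_g - M_e) / SD_t) * (pq / y), where y is the ordinate of the unit normal curve at the point cutting off the proportion p. The source does not state which computing formula was used, so this is an assumption, as is the reading of the first mean column as the graduates' mean; the function name is hypothetical.

```python
from statistics import NormalDist

def biserial_r(mean_pass, mean_fail, sd_total, p_pass):
    """Biserial correlation from group means, total SD, and pass proportion.

    Assumes the conventional formula; not confirmed as the source's method.
    """
    q_fail = 1.0 - p_pass
    std_normal = NormalDist()
    # Ordinate of the unit normal curve at the point dividing p from q.
    y = std_normal.pdf(std_normal.inv_cdf(p_pass))
    return (mean_pass - mean_fail) / sd_total * (p_pass * q_fail) / y

# Scale S for the pilot group: means 10.24 and 11.29, SD 2.89, p = 0.75.
r_s = biserial_r(10.24, 11.29, 2.89, 0.75)
print(round(r_s, 2))  # -0.21
```

Under these assumptions the result agrees with the coefficient of -0.21 reported for the S scale.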


For the combined group of bombardiers, navigators, and pilots it would
require a biserial coefficient of 0.18 for significance at the 5 percent
level and one of 0.24 for significance at the 1 percent level. None of the
biserial r's approaches significance at the 5 percent level.

For  the  group  of  pilots  it  would  require  a  biserial  coefficient  of  0.29 
for  significance  at  the  5  percent  level  and  of  0.39  for  the  1  percent  level. 
Only  the  S  scale  begins  to  approach  significance  at  the  5  percent  level, 
and  this  is  an  inverse  relationship,  which  would  seem  to  indicate  that  a 
high  degree  of  self-sufficiency  is  negatively  associated  with  success  in 
primary  pilot  training.  There  was  no  significant  validity  found  for  the 
truthfulness  key. 
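
The significance requirements quoted above can be approximated from the sampling error of a biserial coefficient when the true correlation is zero, sqrt(pq) / (y * sqrt(N)), multiplied by the normal deviate for the chosen level. The sketch below is an assumption about how such thresholds might have been obtained, since the source does not give its method; the sample sizes and proportions (N = 275, p = 0.86 for the combined group; N = 83, p = 0.75 for pilots) are as read from the table footnotes, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def critical_biserial(n_total, p_pass, alpha):
    """Smallest |r_bis| significant at two-tailed level alpha when rho = 0.

    Uses the large-sample standard error of the biserial coefficient
    under the null hypothesis (an assumed, conventional approximation).
    """
    nd = NormalDist()
    y = nd.pdf(nd.inv_cdf(p_pass))          # normal ordinate at the cut
    se_null = sqrt(p_pass * (1.0 - p_pass)) / (y * sqrt(n_total))
    z = nd.inv_cdf(1.0 - alpha / 2.0)       # 1.96 for 5 percent, 2.58 for 1
    return z * se_null

for n, p in ((275, 0.86), (83, 0.75)):
    print(n, round(critical_biserial(n, p, 0.05), 2),
          round(critical_biserial(n, p, 0.01), 2))
```

Under these assumptions the computed values round to 0.18 and 0.24 for the combined group and to 0.29 and 0.39 for the pilot group, matching the figures in the text.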

(3) Intercorrelations. — The intercorrelations are r_sc = 0.10,
r_st = -0.06, and r_ct = -0.02, based on 200 unclassified aviation
students. On this basis, it is seen that there is essentially no
communality among these variables.

Evaluation. — Validity  data  obtained  for  this  test  indicate  that  it  does 
not  predict  graduation  or  elimination  from  primary  training.  None  of 
the  correlations  differs  significantly  from  zero. 

The truthfulness scores of the group were generally low. Of the 10
truthfulness items, the average aviation student made truthful responses to
only 5.4 (see table 23.1). This indication of a substantial amount of
falsification of response leads to the test author's evaluation of this
instrument: "The most striking conclusion of this study is the
undependability of the truthfulness of simple questionnaire items,
administered in the highly competitive situation of the aviation cadet
classification tests. The study throws serious doubt on the possibility of
using questionnaires of any kind for the classification of aviation
cadets."* The near-zero correlations between the truthfulness scores and the
other two, however, would leave some doubt concerning the generalizing of
the quotation beyond the truthfulness scores themselves.
The Humm-Wadsworth Temperament Scale, CE418A

The Humm-Wadsworth Temperament Scale (10) was developed as a means of
evaluating the temperamental qualities of applicants for employment.
Accordingly, experimental administration of this instrument was undertaken
at Psychological Research Unit No. 1 to evaluate its usefulness in the
aviation-cadet selection program.

Description. — The scale is based upon Rosanoff's original theory of
personality. "It represents a comparison of the incidence of temperamental
traits as they occur in combination within seven groupings or components."
(12) The comparison for each component was obtained by contrasting the
responses of two groups of subjects to the same test questions. One group
included individuals whose behavior manifested the presence of stated
traits. The other group included individuals whose behavior was free from
manifestations of the component. (10)

A brief description of the various components, as given by the authors of
the scale (9), follows:

(a) Normal (N) is primarily a control mechanism, providing rational
balance and temperamental equilibrium. It underlies the conservatism and
conformity to socially acceptable behavior observed in the well-adjusted
person.

(b) Hysteroid (H), considered as ethically inferior motivation; domination
by considerations of selfish personal advantage; irresponsibility toward
the social community.

(c) Cycloid (C) includes emotionality in all dimensions (elevation,
intensity, volume); variations in energy, in attention, and in behavior
reactions.

(1) Manic (M) is manifested by cheerfulness, activity, alertness,
versatility of interest, sometimes irritability.

(2) Depressive (D): components are sadness, inactivity, sluggishness of
thought, hopelessness.

(d) Schizoid (S) includes imagination and withdrawal from environment.

(1) Autistic (A): seclusiveness, inward contemplation, narrowed sphere of
interests.

(2) Paranoid (P): self-sufficiency, certainty of position, militant defense
of ideas, suspiciousness.

(e) Epileptoid (E): tendencies toward project-making with inspiration
toward achievement. (Some epileptoids have lapses of consciousness or other
epileptic symptoms.)

(1) Internal characteristics. — The scale consists of 318 questions, each
requiring a yes or no answer, of which 159 are scored. Examples of typical
questions for several of the components are as follows:

* Letter by Lt. Col. L. F. Shaffer, Subject: Exploratory Study of the
Application of Personality Questionnaires to Aviation Cadets, dated 22 July
1942.
a. Have your activities ever been interrupted by "blank" or unconscious
periods?

b. Are you apt not to give your opinion at a meeting unless you are asked
for it, even when you do not like the way things are going?

c. Do you often have to "sleep over" a matter before you decide what to do?

Questions are scored for one or more of the components, different weights
being assigned to the various components for different items.

(2) Administration. — Between 30 and 90 minutes are required to complete
the items.

The directions used are those of the authors:

This set of questions has to do with the way you think. Read each question
and answer "yes" or "no." Give the first answer that occurs to you in each
case, and let it stand.

(3) Scoring. — Raw scores were obtained for each category of temperament.
These were sent to Dr. Humm, who prepared a profile and a description of
the temperamental pattern and who provided a short written summary which
contained, by implication, a prediction of success.4 Definition of several
terms seems to be in order.

The profile score for a category is the raw score corrected for no-count
and scaled. The no-count is the number of questions answered "no" by the
examinee. The difference score is the profile score for a category
subtracted from the profile score for the normal.

The weightings for the various raw-score categories are the same as those
used in the commercial form of the test.

Statistical results. — Data are available for approximately 200 pilots in
classes 43I, 43J, and 43K.

(1) Reliability coefficients. — Reliability data are not available for this
sample. In the standardization of the scale, the authors determined the
reliabilities of the various categories by the split-half method: normal
0.82; hysteroid 0.85; cycloid manic 0.73; cycloid depressed 0.88; schizoid
autistic 0.88; schizoid paranoid 0.70; and epileptoid 0.75.

(2) Test validity. — Validity data were computed for raw scores, profile
scores, and for profile scores subtracted from the normal components on the
profile scores. The results are presented in tables 23.4, 23.5, and


Table 23.4. — Validation data for a group of pilots in primary training,
using the graduation-elimination criterion, for the raw scores on the
Humm-Wadsworth Temperament Scale, CE418A 1


Temperament  categories 

M. 

SD, 

r*u 

Normal  . 

4$  1 

II)  steroid  . 

39  7 

O.Oa 

P.W 

Manic  . 

•“•oil 

Depressive  . . . 

25.4 

K  A 

o./> 

•*.01 

Autistic  . 

Paranoid  . . . 

23  8 

.1) 

Kpilcptoid  . , .  ,  , 

jrtj 

No  count  . ,  .  , 

177  7 

4tO.  JO 

.94 

1 N_t = 202, p_g = 0.63.

* Significant beyond the 5 percent level.


4 Gratitude is hereby expressed for the generosity of Dr. Humm and for his
cooperation in making possible this validation study.
23.6. For these data, a biserial correlation of 0.18 is required for
significance at the 5 percent level and of 0.23 at the 1 percent level.


Table 23.5. — Validation data for a group of pilots in primary training,
using the graduation-elimination criterion, for the profile scores on the
Humm-Wadsworth Temperament Scale, CE418A 1

Temperament categories      M_g       M_e       SD_t      r_bis
Normal                      98.3      97.6      8.04       0.05
Hysteroid                  103.0     105.5      8.15      *-.19
Manic                       95.6      94.9      5.85        .01
Depressive                  93.6      93.1      7.21       -.04
Autistic                    96.8      94.7      8.15        .16
Paranoid                   101.1     101.3      8.91       -.01
Epileptoid                  95.6      98.1      7.05      *-.22

1 N_t = 200, p_g = 0.61.

* Significant beyond the 5 percent level.


Table 23.6. — Validation data for a group of pilots in primary training,
using the graduation-elimination criterion, for each profile score
subtracted from the normal component on the profile score for the
Humm-Wadsworth Temperament Scale, CE418A 1


Temperament  categorie. 

M, 

M. 

sn, 

-3.1 

-1.8 

5.93 

0.16 

— 1.8 

-1.6 

5.78 

.02 

-6.6 

-6.S 

6.18 

.01 

-7.4 

-7.5 

5.84 

-.01 

-2.7 

-2.7 

8.41 

.00 

-7.7 

-6.5 

5.31 

.14 

1 N_t = 200, p_g = 0.63.


In  treating  the  profile  scores  subtracted  from  the  normal  component  on 
the  profile  score,  a  positive  correlation  means  that  having  a  profile  score 
for  a  temperament  category  that  is  lower  than  the  normal  profile  score 
is  positively  related  to  success. 

(3) Validity of case summaries. — It was stated earlier that Dr. Humm
supplied written case summaries of the pilot students examined. His
conclusions did not include clear-cut predictions of success or failure in
pilot training; in validating the summaries, therefore, further
interpretation was essential.

Two independent sets of ratings of Dr. Humm's summaries were made. The
summaries were sorted into five categories by each of two raters. A rating
of five indicates a high probability of success, and a rating of one means
a high probability of failure.

The contingency coefficient between these two sets of ratings is 0.77. When
converted to make it equivalent to a product-moment correlation, the value
is 0.86, which indicates a high degree of agreement between raters.
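
The source does not say how the contingency coefficient was converted. One common correction, offered here only as a possible reading, divides the coefficient by the largest value it can reach in a table with k rows and k columns, sqrt((k - 1) / k). With five rating categories on each side, that correction happens to reproduce the reported figure. The function names below are hypothetical.

```python
from math import sqrt

def max_contingency(k):
    """Upper limit of the contingency coefficient for a k-by-k table."""
    return sqrt((k - 1) / k)

def corrected_contingency(c, k):
    """Contingency coefficient rescaled by its maximum for a k-by-k table.

    This is an assumed correction, not one the source confirms.
    """
    return c / max_contingency(k)

# Two raters, five categories each, observed coefficient 0.77.
print(round(corrected_contingency(0.77, 5), 2))  # 0.86
```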

The distributions of ratings for Dr. Humm's summaries are presented in
table 23.7. The five categories, and the number of graduates and eliminees
falling into each, are indicated.
Table 23.7. — Distributions of graduates and eliminees according to ratings
made on the basis of Dr. Humm's summaries, for CE418A


lUting 

Rating  I  j 

Rating  II 

Graduated 

Eliminated 

Graduated  [Eliminated 

*  (high)  . 

3 

1 

10 

6 

33 

22 

SS 

25 

IS 

19 

II 

6 

26 

11 

30 

11 

i  (low)  . 

24 

20 

32 

21 

Total  . 

120 

73 

120 

21 

The extent to which the obtained frequencies deviate from the frequencies
expected by chance is expressed in terms of chi square. Chi square is 1.05
for rating I and 0.82 for rating II. More than 90 percent of chance
deviations would have been as great.
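
The statement about chance deviations can be checked by computing the probability of a chi square at least as large as the one observed. The sketch below assumes 4 degrees of freedom, that is (5 - 1) x (2 - 1) for five rating categories against graduated or eliminated; the source does not state the degrees of freedom, and the function name is hypothetical. For an even number of degrees of freedom the tail probability has a simple closed form, so no statistical library is needed.

```python
from math import exp

def chi2_sf_even_df(x, df):
    """P(chi-square >= x) for a positive even df, via the closed-form sum."""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Sum of (x/2)^j / j! for j = 0 .. df/2 - 1.
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return exp(-half) * total

for chi_square in (1.05, 0.82):
    print(chi_square, round(chi2_sf_even_df(chi_square, 4), 2))
```

Under the 4-degrees-of-freedom assumption both probabilities exceed 0.90, in line with the statement in the text.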

(4) Validation of integration ratings. — The cases were also rated
according to Dr. Humm's statements concerning temperamental integration.
These statements were sorted into three categories, which were described as
well-integrated, average integration, and poorly integrated. Two
independent ratings were made. The numbers of cases passing and failing for
each of these categories of temperamental integration are given in table
23.8.


Table 23.8. — Distribution of graduates and eliminees according to ratings
made on the basis of Dr. Humm's statements concerning temperamental
integration, for CE418A

                              Rating I                 Rating II
Category                Graduated  Eliminated    Graduated  Eliminated
Well integrated             21         15            18         14
Average integration         46         28            37         19
Poorly integrated           54         31            66         41
Total                      121         74           121         74

For the 195 cases on which data are available, the contingency coefficient
between the two ratings is 0.76. When converted to an equivalent
product-moment correlation, the value is 0.83.

Again the extent to which the obtained frequencies deviate from the
frequencies expected by chance is expressed in terms of chi square. Chi
square is 0.11 for rating I and 0.34 for rating II. More than 60 percent of
chance deviations would have been as great.

Evaluation. — No significant relationships were found between pilot success
in primary flying school and ratings either of Dr. Humm's analyses of
temperamental integration or of his case summaries. Only two of the
category scores yielded biserial coefficients significant at or beyond the
5 percent level, the hysteroid score yielding a correlation of -0.19, and
the epileptoid score of -0.22. If these validities prove to hold up in very
large samples, the two scores would no doubt add a small amount to
composite prediction, since their contribution would be unique.
Several objections arise in the use of this test. One specific objection to
this instrument is its length. Another is the fact that of the 318 items
included in the scale, only 159 are scored, which makes it extremely
uneconomical. The author's contention, however, is that the unscored items
are necessary to insure the validity of those that are scored. The
multi-weighted scoring is also contrary to efficient test procedures.

Another limitation is phrased in the words of the authors: "As high as 25
percent or 30 percent of normal subjects may invalidate their tests." (10)
One important limiting factor in the use of personality tests is the
restriction of responses to two categories. It is the stated opinion of
both examinees and administrators that lack of a third category, "?", which
provides an acceptable answer to an otherwise nonapplicable item, tends to
influence adversely the examinee's motivation toward the remaining items in
the test. The examinee may easily feel restricted and forced, and if
inclined at all to be influenced by pride, will more readily falsify.

The  Personal  Audit,  CE431A 

As a commercial test, this instrument was known as the Adams-Lepley
Personal Audit. Experimental administration to obtain validation data was
undertaken at Psychological Research Unit No. 1.

Description. — This test samples areas of personality that are mentioned
frequently in connection with the Rorschach, the Thematic Apperception
Test, and other projective techniques. It was felt that this instrument
would furnish a more highly structured approach to these various areas of
personality than is afforded by projective techniques, and thus serve as a
control instrument, the validity of which would be compared with the
validities of projective techniques.

(1)  Internal  characteristics. — The  test  consists  of  9  sections  of  50 
items  each.  For  administrative  convenience,  the  test  is  divided  into  three 
parts.  Each  of  the  sections  is  designed  to  sample  a  specific  personality 
area,  so  that  nine  areas  are  tapped. 

In part I, scores are derived for sociability, suggestibility, and
irritability. In the section on sociability (extroversion), the examinee
indicates the degree of his liking for each of 50 activities. Responses are
indicated in four degrees: A — like it a great deal, B — some liking for
it, C — little liking for it, and D — practically no liking for it. Sample
items are: "Raising money for a charity"; and, "Watching a big fire."

In the section on suggestibility (a tendency to agree with authority), the
examinee is told that at least half of a group of experts agreed that each
of the 50 items is true. Three degrees of response are afforded: A —
complete agreement with decision of experts, B — agree, but with
reservations, and C — disagreement with experts. Sample items are:
"Majority rule is safest in the long run"; and, "No cultured person would
ever use profanity."
In the section on irritability (susceptibility to annoyance), a number of
common annoyances are listed. The examinee indicates one of four degrees of
annoyance: A — much annoyance, B — some annoyance, C — a little annoyance,
and D — never any annoyance. Sample items are: "To have someone read over
your shoulder"; and, "To have to wait in a long line to see a show."

Part II contains sections on tendency to rationalize, anxiety, and sexual
conflicts.

For each item in the section on tendency to rationalize (tendency to make
alibis and excuses) the examinee indicates: A — statement is true, B —
doubt the truth of the statement, and C — statement is usually false.
Sample items are: "Many good athletes are poor students"; and, "Still
waters run deepest."

A list of common fears thought to be experienced by all persons to some
extent is presented in the section on anxiety. The examinee indicates his
degree of fear: A — considerable fear, B — some fear, C — a little fear,
and D — no fear. Sample items are: "Having a physician give you the wrong
medicine"; and, "Losing your mind or becoming insane."

The section on sexual conflicts consists of a type of controlled
word-association test. For each item, the examinee has four alternative
responses from which to choose. Sample items follow:

Love    A. adore    B. esteem    C. worship    D. yearn

Rape    A. attack   B. assault   C. ruin       D. temptation

Part III of the test contains sections on personal intolerance,
flexibility of attitudes, and obsessive thoughts.

In the section on personal intolerance, the examinee indicates the extent
of his dislike for certain activities or things: A — a great deal of
dislike, B — some dislike, C — little dislike, and D — no dislike. Sample
items are: "People who are stingy with their money"; and, "A dirty hobo who
asks you for a dime."

The section on flexibility of attitudes contains a list of activities and
things for which the examinee is to indicate his present feelings compared
with those of 3 or 4 years ago: A — indicates feeling the same as 3 or 4
years ago, B — indicates feeling partly changed, C — feeling is
considerably different. Sample items are: "Socialization of medicine"; and,
"Capital punishment."

In the section on obsessive thoughts or worry about unsolved problems the
examinee indicates the amount of thinking he has done about certain topics.
Sample items are: "Kissing or petting between young men and women"; and,
"Being demoted or discharged from a job."

(2) Administration. — Administration is in three parts, to correspond to
the parts of the test. Five minutes are allowed for each section, and each
new section is begun simultaneously by the whole group. Approximately 45
minutes are required to complete the test. Specific directions for
answering items are given at the beginning of each section. Pertinent
general directions are:

This is not a test of your intelligence or of your ability; there are no
right or wrong answers. Cadets who have engaged in different kinds of work
have a wide range of attitudes and interests. For example, one person might
be interested in certain things that may not interest another person at
all. In answering the questions in this survey, be careful to avoid
indicating one response when you mean another. There are no catch items.
Work rapidly, omit no responses.

(3) Scoring.—Scoring was accomplished for each of the nine sections
by means of the authors' a priori keys. In addition, one of the authors
made predictions of pilot success on a nine-point scale.* The predictions
were based on the test scores, expressed in stanine form. Each stanine
score was given a rating of either 1 or 0, so that the nine trait scores
could be summated into a total prediction score ranging from zero to
eight. For example, for the tests of sociability and flexibility of attitudes,
a stanine of four or above became one, and other scores became
zero. For the other sections of the test, a stanine of four or below
became one in the aggregate weighting.
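
The dichotomize-and-sum rule just described can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `prediction_score` and the placeholder section names `section_c` and `section_d` are hypothetical, and only the two sections named in the text (sociability and flexibility of attitudes) are treated as favorable when high.

```python
# Sections for which a stanine of four or above earns a point (per the text).
HIGH_IS_FAVORABLE = {"sociability", "flexibility_of_attitudes"}


def prediction_score(stanines):
    """Sum the 0/1 ratings derived from a dict of section name -> stanine (1-9)."""
    total = 0
    for section, stanine in stanines.items():
        if section in HIGH_IS_FAVORABLE:
            total += 1 if stanine >= 4 else 0   # high score is favorable
        else:
            total += 1 if stanine <= 4 else 0   # low score is favorable
    return total


# Hypothetical examinee with four of the section scores shown.
example = {
    "sociability": 6,               # >= 4, rated 1
    "flexibility_of_attitudes": 3,  # < 4, rated 0
    "section_c": 4,                 # <= 4, rated 1 (placeholder name)
    "section_d": 7,                 # > 4, rated 0 (placeholder name)
}
print(prediction_score(example))  # 2
```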

Statistical  results. — The  test  was  validated  on  a  group  of  271  pilots  in 
class  43C,  originally  tested  at  Psychological  Research  Unit  No.  1  from 
May  31  to  June  27,  1943. 

(1) Test validity.—As explained above, the clinical predictions on a
nine-point scale of success or failure in primary pilot training were
validated. These data, along with validity coefficients for the nine section
scores, are presented in table 23.9.


Table 23.9.—Validation data for scores and clinical predictions for the Personal
Audit, CE431A, for a group of pilots¹ in primary training, using the
graduation-elimination criterion

Score            M_g      M_e      SD_t     r_bis
[illegible]      5.29     5.54     1.69     -0.08
[illegible]      3.91     4.56     1.95     -.12
[illegible]      4.04     4.07     2.00     -.01
[illegible]      4.12     4.04     1.93      .02
[illegible]      4.05     3.82     1.95      .06
[illegible]      4.11     4.14     1.79     -.01
[illegible]      4.11     4.25     1.[?]8   -.04
[illegible]      4.09     3.75     1.90      .09
[illegible]      3.94     4.21     1.91     -.07
[illegible]      1.92     [?].89   4.88      .01

¹ N_t = 271, p_g = 0.89.


The biserial coefficients range from -0.12 to 0.09, which are well
within the range to be expected of a chance distribution of biserial correlations,
the true mean of which is zero. It would require a biserial
coefficient of 0.20 for significance at the 5 percent level and a biserial
of 0.26 for significance at the 1 percent level.

(2) Intercorrelations.—Several intercorrelations were determined between
the test author's over-all predictions and the clinical predictions
of success made in several of the clinical-procedures techniques (see ch.
24). These intercorrelations are presented in table 23.10.

* Predictions made by Maj. William Lepley. The nine-point rating scale is described in
chapter 24.

Table 23.10.—Correlations of author's over-all ratings based on the Personal Audit,
CE431A, with clinical predictions from other techniques, for a group of pilots in
primary training

Technique                                                     N        r
[Table body largely illegible in the source. Legible technique labels include
Rorschach (CE701[?]), the Interaction Test, and Observation During Psychomotor
Test Rest Period (CE709A). Legible N values are 189, 189, 188, and 175; legible
r values are 0.01, -.09, and -.01. The pairing of labels and values is not recoverable.]

These intercorrelations indicate practically no communality between
techniques.

Evaluation.—Because of the low validities, -0.12 to 0.09, which could
have occurred entirely by chance, for both the over-all ratings and for
the trait scores, it is concluded that this test is of no value in predicting
pilot performance in primary training.

It is probable that this instrument, along with the majority of personality
inventories, does not succeed in predicting air-crew success, because
they were not designed specifically for that task. That other verbal
temperament tests, not very different in form, but based upon some definite
hypothesis concerning air-crew qualities, do succeed, is evidence in
support of this conclusion (e.g., see the discussion of the Satisfaction
Test, CE4D9D, in ch. 25).

The  Bernreuter  Personality  Inventory,  CE433A 

As  part  of  the  program  of  relatively  systematic  experimental  usage  of 
existing  commercial  personality  inventories,  this  test  was  also  adminis¬ 
tered  at  Psychological  Research  Unit  No.  1. 

Description. — The  standard  commercial  form  of  the  instrument  (2) 
was  used  in  group-test  administration.  It  consists  of  125  items,  which 
are  answered  "yes,"  “no,"  or  "?."  Typical  items  are: 

Do  your  interests  change  rapidly? 

Do you usually try to avoid dictatorial or "bossy" people?

Do  people  ever  come  to  you  for  advice? 

(1) Administration.—Twenty-five minutes were allowed for completion
of the items. The administrative directions consisted of an explanation
of the method of employing the answer sheet and the following
general remarks:

The questions on this blank are intended to indicate your interests and attitudes.
It is not an intelligence test, nor are there any right or wrong answers.

(2) Scoring.—The keys provided for the test were not employed. Instead,
an item-validation study was undertaken. It was intended that if
a sufficient number of items proved valid, the author's keys would then
be validated.
Statistical results.—The data which follow were computed for a sample
of pilots tested in June 194[?] at Psychological Research Unit
No. 1.

(1) Item validation.—A distribution of phis is presented in table
23.11, based on data from 600 graduates and 200 eliminees from primary
pilot training.*

Table 23.11.—Distribution of phis based on validation of responses against the
graduation-elimination criterion in primary training for the Bernreuter
Personality Inventory, CE433A¹

Phi                        f
0.08 and up² ........      [?]
0.03 to 0.07 ........      24
-0.02 to 0.02 .......      [?]
-0.07 to -0.03 ......      16
-0.08 and below³ ....      4
Total ...............      108

¹ N_t = 800, p_g = 0.75.
² The highest positive phi was 0.10.
³ The lowest negative phi was -0.10.


In interpreting these phi coefficients, it can be said that for an N of
800, a phi of 0.07 is significant at approximately the 5 percent level, and
a phi of 0.09 is significant at the 1 percent level of confidence. Altogether,
six phis, two positive and four negative, reached or exceeded the 5 percent
level of significance. Two attained the 1 percent level of significance.
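
The thresholds quoted for phi can be approximated from the relation between phi and chi-square for a fourfold table (chi-square = N × phi², one degree of freedom), which gives a critical phi of roughly z / √N. The sketch below assumes a two-tailed normal approximation; the source does not state how its thresholds were derived, and the function name `phi_critical` is illustrative only.

```python
import math
from statistics import NormalDist


def phi_critical(n, alpha):
    """Approximate smallest |phi| significant at two-tailed level alpha for N = n."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed normal deviate
    return z / math.sqrt(n)                  # from chi-square = N * phi**2


for alpha in (0.05, 0.01):
    # For N = 800 this prints 0.07 (5 percent) and 0.09 (1 percent).
    print(alpha, round(phi_critical(800, alpha), 2))
```

The same function applied to the smaller samples discussed later in the chapter gives values close to, though not always identical with, the rounded thresholds quoted in the text.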

Evaluation.—On the basis of the very small number of phis attaining
statistical significance, it was decided that this instrument contains an
insufficient number of valid items for the prediction of primary pilot
success to make further scoring measures or statistical analyses worth
while.

An Inventory of Factors S T D C R, CE434A

This inventory was developed as a commercial instrument on the basis
of factor-analysis studies of various personality-questionnaire items. Experimental
administration was undertaken in an attempt to assess the
validity of the inventory for pilot selection.

Description.—The factors S, T, D, C, and R taken together probably
cover the area of personality generally encompassed by the concept of
introversion-extroversion. According to the author (4), each factor actually
represents a dimension of personality with two opposite poles. The
factors may be described as follows:

S—Social introversion, as exhibited in shyness and tendencies to withdraw
from social contacts.

T—Thinking introversion, an inclination to meditative thinking, philosophizing,
and analyzing one's self and others.

D—Depression, including feelings of unworthiness and guilt.

C—Cycloid tendencies, as shown in strong emotional reactions, fluctuations
in mood, and tendency toward flightiness or instability.

R—Rhathymia, a happy-go-lucky or carefree disposition, liveliness
and impulsiveness.

* Only "yes" responses were counted in tallying the distribution.

(1) Internal characteristics.—The inventory consists of 175 items,
each of which is to be answered by "yes," "?," or "no." Some of the
items are scored for only one factor, others are scored for several, as in
the sample items which follow:

Sample 1: Do you express yourself more easily in speech than in writing? (If
answered "no," this item has an S value of 1.)

Sample 2: Are you inclined to act on the spur of the moment without thinking
things over? (If answered "yes," this item has an R value of 2.)

Sample 3: Are you inclined to be moody? (If answered "yes," this item has a value
of 1 for factors T, D, and C.)

(2) Administration.—Pertinent directions are:

Read each question in the test booklet in turn, think what your behavior has
usually been, and mark that answer space, after the corresponding item number, that
describes your behavior best. Mark the "?" only when you are unable to decide between
the "yes" and "no." Be sure to answer every question. There is no implication
of right or wrong in any of these questions * * *

(3) Scoring.—The inventory is scored by means of one key for each
of the five factors. The keys for factors S, D, and C give all significant
responses a weight of one point, and those for factors T and R weight
some responses two points.

Statistical  results. — Experimental  administration  of  this  instrument 
was  completed  at  Psychological  Research  Unit  No.  1  in  May  1944  with 
approximately  1,100  pilots  who  took  primary  training. 

(1) Test reliability.—Reliabilities were not computed for aviation
students. The test author cites reliabilities obtained by combining alternate
sixths of the items into two pools of approximately equal lists and
the use of the Spearman-Brown formula. This procedure yielded estimated
reliabilities of 0.92, 0.89, 0.91, 0.91, and 0.89 for factors S, T, D,
C, and R, respectively, in a sample of 200 college undergraduates selected
at random from a criterion group.
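
The Spearman-Brown step mentioned above steps up the correlation between two half-length pools to an estimate for the full-length test. A minimal sketch follows; the function names are illustrative, and the half-pool correlation it prints is back-calculated from the published full-length estimate rather than taken from the source.

```python
def spearman_brown(r_part, factor=2.0):
    """Estimated reliability when test length is multiplied by `factor`."""
    return factor * r_part / (1.0 + (factor - 1.0) * r_part)


def half_test_r(r_full):
    """Half-pool correlation implied by a stepped-up (factor 2) reliability."""
    return r_full / (2.0 - r_full)


# Half-pool correlation implied by the reliability of 0.92 cited for factor S.
r_half = half_test_r(0.92)
print(round(r_half, 3))                  # 0.852
print(round(spearman_brown(r_half), 2))  # 0.92
```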

(2) Intercorrelations.—The intercorrelations of the five factor scores
are presented in table 23.12.


Table 23.12.—Intercorrelations of the five factor scores obtained on the Inventory
of Factors S T D C R, CE434A¹

Factor      S        T        D        C        R
S           ...      0.24     0.58     0.58     -0.51
T           0.24     ...      .51      .48      -.21
D           .58      .51      ...      .91      -.11
C           .58      .48      .91      ...      .16
R           -.51     -.21     -.11     .16      ...

¹ N_t = 1,106 pilots in primary training.
(3) Test validity.—The five factor keys published with the test were
used for scoring. Validity data are presented in table 23.13.

Table 23.13.—Validation data for a group of pilots in primary training, using the
graduation-elimination criterion, for the Inventory of Factors S T D C R, CE434A¹

Factor      M_g       M_e       SD_t       r_bis
S           12.63     12.97     7.37       -0.03
T           32.54     33.83     [?].71     -.09³
D           12.93     13.93     8.17       -.07⁴
C           19.28     20.38     9.62       -.07⁴
R           47.23     46.79     10.10      .03

¹ N_t = 1,106, p_g = 0.77.
² Corrected to an unrestricted augmented stanine standard deviation of 2.10.
³ Significant at the 1 percent level.
⁴ Significant at the 5 percent level.

(4) Item validity.—The distribution of phis based on item analysis
data used in the cross-validation study is presented in table 23.14. These
data are based only on the "yes" responses.

Table 23.14.—Distribution of phis based on samples of pilots in primary training,
using the graduation-elimination criterion, for the Inventory of Factors S T D C R,
CE434A

Phi                       f (odds)¹     f (evens)²
0.13 to 0.17 ........         1              1
0.08 to 0.12 ........         9              9
0.03 to 0.07 ........        27             37
-0.02 to 0.02 .......        50             58
-0.07 to -0.03 ......        43             36
-0.12 to -0.08 ......        19             10
-0.17 to -0.13 ......         1              2
-0.22 to -0.18 ......         1              0
Total ...............       151            153

¹ N = 526.
² N = 516.


For an N of 520, a phi of 0.09 is significant at approximately the 5
percent level, and a phi of 0.12 is significant at the 1 percent level. Sixteen
phis in the odds sample (7 positive and 9 negative) reached or exceeded
the 5 percent level of significance, with 5 of these (1 positive and
4 negative) reaching or exceeding the 1 percent level. For the evens
sample 26 phis (9 positive and 17 negative) reached or exceeded the 5
percent level, and 8 of these (2 positive and 6 negative) reached or exceeded
the 1 percent level.

(5) Cross-validation data.—Cross-validation data were also computed
for this sample, which was split into two groups, odds and evens. Separate
item analyses were accomplished for each subsample, and two scoring
keys devised. The criteria for scoring a response were: (1) a phi
significant at or beyond the 5 percent level (0.09) and (2) a split of
85-15 or better. This is standard procedure for many of the tests to follow
in this chapter and in others.

The evens group was scored with the odds key, and the odds group
was scored with the evens key. The validities obtained are presented in
table 23.15.
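
The split-sample procedure described above (build a key on one half, score the other half with it) can be sketched as follows. This is a hedged illustration, not the procedure as actually run: the data layout, the function names, and the reading of "a split of 85-15 or better" as an endorsement share between 15 and 85 percent are all assumptions.

```python
from math import sqrt


def phi(endorsed, graduated):
    """Phi between a 0/1 response vector and a True/False graduation vector."""
    a = sum(1 for e, g in zip(endorsed, graduated) if e and g)
    b = sum(1 for e, g in zip(endorsed, graduated) if not e and g)
    c = sum(1 for e, g in zip(endorsed, graduated) if e and not g)
    d = sum(1 for e, g in zip(endorsed, graduated) if not e and not g)
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else 0.0


def build_key(items, graduated, min_phi=0.09, max_split=0.85):
    """Return {item index: +1 or -1} for items passing both criteria.

    `items` holds one 0/1 response vector per item (one entry per examinee).
    """
    key = {}
    n = len(graduated)
    for idx, endorsed in enumerate(items):
        share = sum(endorsed) / n
        if not (1 - max_split <= share <= max_split):
            continue  # response split too extreme (assumed reading of 85-15)
        value = phi(endorsed, graduated)
        if abs(value) >= min_phi:
            key[idx] = 1 if value > 0 else -1
    return key


def score(answers, key):
    """Rights, wrongs, and rights minus wrongs for one examinee's 0/1 answers."""
    rights = sum(1 for i, sign in key.items() if sign > 0 and answers[i])
    wrongs = sum(1 for i, sign in key.items() if sign < 0 and answers[i])
    return rights, wrongs, rights - wrongs


# Toy "odds" subsample: two items, four examinees (hypothetical data).
odds_items = [[1, 1, 0, 0], [1, 0, 1, 0]]
odds_graduated = [True, True, False, False]
odds_key = build_key(odds_items, odds_graduated)
print(odds_key)                  # {0: 1}
print(score([1, 1], odds_key))   # (1, 0, 1) for a hypothetical evens examinee
```

In the cross-validation design, a key built on the odds subsample would be applied only to the evens subsample, and vice versa, so that no examinee is scored with a key derived from his own responses.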



Table 23.15.—Cross-validation data based on two groups of pilots in primary training
using a graduation-elimination criterion, for the Inventory of Factors
S T D C R, CE434A¹

Group      Key       Score       M_g      M_e      SD_t     r_bis     Corrected r_bis²
Odds³      Evens     Rights⁴     14.29    14.58    2.98     -0.02     0.05
                     Wrongs      5.25     5.20     2.14     .01       -.06
                     R-W         9.04     9.18     4.60     -.02      .04
Evens⁵     Odds      Rights      8.62     8.27     1.99     .10       .10
                     Wrongs      6.82     6.57     2.55     .06       .11
                     R-W         1.79     1.70     3.84     .01       -.05

¹ p_g = 0.[?]9 for both groups.
² Corrected to an unrestricted augmented stanine standard deviation of 2.10.
³ N_t = 526; number of scored items = 27.
⁴ Rights mean positively keyed responses and wrongs mean negatively keyed responses.
⁵ N_t = 516; number of scored items = 19.

Evaluation.—On the basis of the validity study, it appears that the
Inventory of Factors S T D C R is not promising for predicting graduation-elimination
from primary pilot training. The validities of factor
scores for D (-0.07) and C (-0.07) are both significant at the 5 percent
level and for factor T (-0.09) at the 1 percent level. Thus negative
relationships with thinking introversion, depression, and cycloid
tendencies probably exist but are too low to be useful in selecting pilots.

There would seem to be an excess number of valid responses beyond
the confidence limits, but, in view of the apparent unimodality of the
distribution, with its central tendency at zero, and the failure of the
cross-validation test (see table 23.15), it would appear that there are
very few genuinely valid items for pilot training in the inventory.
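
The "excess" referred to above can be made concrete by comparing the observed count of significant phis with the count expected by chance alone. The sketch below uses the 151 responses and 16 significant phis reported for the odds sample; it assumes, for illustration only, that the items are statistically independent, which questionnaire items generally are not, so the tail probability should be read as a rough guide. The function names are hypothetical.

```python
from math import comb


def expected_by_chance(n_items, alpha):
    """Number of items expected to reach level alpha when no item is valid."""
    return n_items * alpha


def prob_at_least(k, n, p):
    """Binomial probability of k or more 'significant' items out of n."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_items, observed, alpha = 151, 16, 0.05
print(round(expected_by_chance(n_items, alpha), 2))  # 7.55
print(prob_at_least(observed, n_items, alpha))       # chance of 16 or more
```

A count well above the chance expectation is what the paragraph calls an excess; the failure of the keys on cross-validation is the reason the text nevertheless discounts it.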

The relatively high intercorrelation between several of the factors
seems to indicate that there is some overlapping in the items. Thus, for
factors D and C, r = 0.91, while for D and S, and D and T, the correlations
are 0.58 and 0.51. The latter are tolerable and may represent the
actual degree of correlation between the factors, but the correlation between
C and D is so high as to demand a revision of the keys for these
two factors for an aviation-student population. These intercorrelations
are quite similar to those obtained by the test author on a college undergraduate
sample, with two exceptions: in the college sample r_DT = 0.15
and r_CT = 0.14, while the corresponding values for the aviation student
population were 0.51 and 0.48, respectively.

The  Guilford-Martin  Personnel  Inventory,  CE436A 

This  commercially  provided  instrument  was  designed  for  two  primary 
purposes.  First  of  all,  it  was  designed  to  assist  supervisors  of  workers  in 
business  and  industry  in  detecting  and  diagnosing  those  individuals  who 
are  personally  maladjusted  in  their  jobs,  particularly  those  who  are  dis¬ 
contented  and  likely  to  become  troublemakers.  Secondly,  the  test  was  de¬ 
signed  to  extend  the  list  of  traits  of  temperament  already  assessed  by 
the  Inventory  of  Factors  S  T  D  C  R,  CE434A.  The  area  covered  by 
this inventory may be roughly designated by the term "paranoid," though
only the extreme symptoms deserve that appellation borrowed from psychopathology.

Description.—The authors hypothesize that there are several somewhat
related aspects of the paranoid disposition. These aspects may be
described as (1) subjectivity (taking things personally; ideas of reference;
touchiness), (2) belligerence (domineering attitude; craving for
superiority), (3) suspiciousness, and (4) faultfinding or hypercriticalness.
In setting up the lists of items diagnostic of these traits, it was
found that the last two could not be scored with sufficient independence
to justify separate keys. The list of traits measured by the inventory,
therefore, reduces to three. Using the names of the more favorable end
of the scale in each instance, they are:

O—Objectivity (as opposed to personal reference or a tendency to
take things personally).

Ag—Agreeableness (as opposed to belligerence or a dominating disposition
and an overreadiness to fight over trifles).

Co—Cooperativeness (as opposed to faultfinding or overcriticalness
of people and things).

(1) Internal characteristics.—The inventory consists of 150 items.
The questions are to be answered by either "yes," "?," or "no." All but
eight of the questions yield scores for one or more of the three factors.
Examples of questions, one for each of the scoring categories, follow:

O—Are you inclined to be thinking about yourself much of the time? (Answer of
"no" is significant for O.)

Ag—Are you annoyed when people tell you how you should do a thing? (Answer
of "no" is significant for Ag.)

Co—Does it seem to you that human beings hardly ever learn to avoid making the
same mistakes twice? (Answer of "no" is significant for Co.)

(2) Administration.—Thirty minutes are sufficient for completion of
the 150 items. Essential comments from the directions are:

Read each question in turn, think what your behavior has usually been, then
blacken your answer sheet * * * Answer by "?" only when you are unable to
decide between the Yes and No. There is no right answer to any of these questions
except the answer that tells how you think or feel about it * * *

(3) Scoring.—Each item may be scored for one or more of the three
factors. In this case, also, the sample was split into odds and evens
groups. Separate item validations were accomplished on these two subsamples,
and two scoring keys were made. The criteria for scoring a
response were: (1) a phi significant at or beyond the 5 percent level
(0.10) and (2) a split of 90-10 or better. It is interesting to note that
there is practically no correlation between the odd and even keys and
that only 13 scored responses are held in common.

Statistical results.—Data are available for approximately 950 pilots,
originally tested in May 1944 at Psychological Research Unit No. 1,
who took primary training.
(1) Test reliability.—Reliabilities were not computed for aviation students.
Reliabilities (comparable halves) cited by the authors of the test
are: O, 0.83; Ag, 0.80; and Co, 0.91.

(2) Intercorrelations.—The intercorrelations of the three scores are
presented in table 23.16.


Table 23.16.—Intercorrelations of the three scores obtained on the Guilford-Martin
Personnel Inventory, CE436A¹

Factor      O        Ag       Co
O           ...      0.60     0.62
Ag          0.60     ...      .62
Co          .62      .62      ...

¹ N_t = 945 classified pilots in primary training.


The trait-score intercorrelations are high, which is to be expected, as
all three traits are attempts to measure paranoid temperament. At the
same time they are sufficiently low to provide for differential measurement.
These data closely approximate the intercorrelations obtained by
the test authors on a civilian sample.

(3) Test validity.—Validation results based on a sample of 945 pilots
in primary training are presented in table 23.17. The three factor keys
published with this test were used.


Table 23.17.—Validation data for a group of pilots in primary training using the
graduation-elimination criterion for the Guilford-Martin Personnel Inventory,
CE436A¹

Factor      M_g       M_e       SD_t      r_bis      Corrected r_bis²
O           52.27     50.25     11.78     0.10³      0.16
Ag          30.62     28.70     9.23      .12⁴       .13
Co          60.38     56.47     16.19     .14⁴       .19

¹ N_t = 945, p_g = 0.79.
² Corrected to an unrestricted augmented stanine standard deviation of 2.10.
³ Significant beyond the 5 percent level.
⁴ Significant at or beyond the 1 percent level.

(4) Item validity.—After dividing the sample of answer sheets into
random halves, the responses to the items were correlated with the graduation-elimination
criterion from primary pilot training. The distributions
of the phi coefficients are shown in table 23.18.


Table 23.18.—Distribution of validity phis based on item analysis of the Guilford-
Martin Personnel Inventory, CE436A

Phi                       f (odds)¹     f (evens)²
0.13 to 0.17 ........         3              2
0.08 to 0.12 ........        12              6
0.03 to 0.07 ........        17             20
-0.02 to 0.02 .......        45             49
-0.07 to -0.03 ......        18             36
-0.12 to -0.08 ......        14             1[?]
-0.17 to -0.13 ......         6             [?]
Total ...............       115            140

¹ N = 464.
² N = 467.
For an N of 468, a phi of 0.09 is significant at approximately the 5
percent level of confidence, and a phi of 0.12 is significant at the 1 percent
level. There are 31 phis in the odds sample (15 positive and 16
negative) that reached or exceeded the 5 percent level of significance,
with 14 of these (5 positive and 9 negative) reaching or exceeding the
1 percent level. For the evens sample 19 phis (7 positive and 12 negative)
reached or exceeded the 5 percent level of significance; 5 of these
(2 positive and 3 negative) reached or exceeded the 1 percent level.

(5)  Cross-validation  data. — Cross-validation  data  were  obtained,  as 
shown  in  table  23.19. 


Table 23.19.—Validity data based on two groups of pilots in primary training,
using the graduation-elimination criterion, for the Guilford-Martin
Personnel Inventory, CE436A

Group      Key       Score        M_g         M_e      SD_t     r_bis       Corrected r_bis
Odds¹      Evens     Rights²      15.45       12.08    2.48     ³0.1[?]     0.12
                     Wrongs       4.40        4.81     1.76     -.15        -.14
                     R-W+20       29.04       28.12    3.60     ³.14        .14
Evens⁴     Odds      Rights       36.[?]      54.70    7.07     .[?]        .21
                     Wrongs       7.01        6.74     4.44     .04         -.01
                     R-W+20       49.55       47.88    8.74     .10         .19

¹ Odds sample N_t = 466, p_g = 0.79; number of scored items = 27.
² In this table, rights means positively scored responses, wrongs means negatively scored
responses.
³ Significant at the 5 percent level.
⁴ Evens sample N_t = 467, p_g = 0.79; number of scored items = 59.


Evaluation.—On the basis of the validity study, it appears that the
Guilford-Martin Personnel Inventory has some promise for predicting
graduation-elimination from primary pilot training. Thus, scores for
factors Ag (r_bis = 0.12) and Co (r_bis = 0.14) are significantly related to
the criterion at the 1 percent level and for factor O (r_bis = 0.10) at the
5 percent level. In view of the fact that these scores undoubtedly contribute
something different from the classification battery, they would
make some addition to its predictive value for pilot selection.
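
The biserial coefficients cited in this evaluation can be approximately reproduced from the summary statistics reported for each factor (mean of graduates, mean of eliminees, total standard deviation, and proportion graduating). The sketch below uses the standard biserial formula, r_bis = ((M_g - M_e) / SD_t) × (pq / y), where y is the normal ordinate at the point dividing the distribution into proportions p and q. The function name is illustrative, and the source does not show its own computation.

```python
from statistics import NormalDist


def biserial(mean_grad, mean_elim, sd_total, p_grad):
    """Biserial r between a score and a graduate/eliminee dichotomy."""
    q = 1.0 - p_grad
    z = NormalDist().inv_cdf(p_grad)  # cut point on the unit normal curve
    y = NormalDist().pdf(z)           # ordinate of the normal curve at the cut
    return (mean_grad - mean_elim) / sd_total * (p_grad * q / y)


# Factor Ag: M_g = 30.62, M_e = 28.70, SD_t = 9.23, p_g = 0.79.
print(round(biserial(30.62, 28.70, 9.23, 0.79), 2))  # 0.12
# Factor Co: M_g = 60.38, M_e = 56.47, SD_t = 16.19, p_g = 0.79.
print(round(biserial(60.38, 56.47, 16.19, 0.79), 2))  # 0.14
```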

On inspection, it may be seen that each empirical key is valid, yet little
or no correlation was shown between keys. This fact may be due to sampling
fluctuation. It is recommended that a combined key be used to score
future papers of this test for pilot selection.


Inventory of Factors G A M I N, CE435A

In the factor-analysis approach to the problems of temperament, several
traits have been identified, and a series of inventories have been
constructed which effectively measure some of these traits. The Inventory
of Factors G A M I N (6) adds five more temperament variables to the
eight already measured by the two other tests in this series, namely, the
Guilford-Martin Personnel Inventory, CE436A, and the Inventory of
Factors S T D C R, CE434A. Hence experimental administration of
this instrument, for purposes of determining validity for predicting pilot
success, was undertaken at Psychological Research Unit No. 1.


Description.—The definitions of the trait names used in this inventory,
as in the preceding two, derive from factorial studies and from subsequent
item analyses. The five traits measured here include:

G—General pressure for overt activity, or general activity vs. inactivity.

A—Ascendancy in social situations as opposed to submissiveness;
leadership qualities.

M—Masculinity of attitudes and interests as opposed to femininity.

I—Lack of inferiority feelings; self-confidence.

N—Lack of nervous tenseness and irritability, or psychosomatic stability
vs. nervousness and jitteriness.

(1)  Internal  characteristics. — The  inventory  contains  270  items,  each 
calling  for  a  response  of  "yes,”  "?,”  or  "no.”  Examples  of  items  from 
the  keys  for  each  trait  follow : 

For trait G.—"Are you inclined to be quick in your actions?"

"Can you turn out a large amount of work in a short
time?"

For trait A.—"Do you find it difficult to get rid of a salesman to
whom you do not care to listen or give your time?"

"Have you ever, on your own initiative, organized a
club or group of any kind?"

For trait M.—"Do you like love scenes in a movie or a play?"

"Do you (or would you) like to go hunting with a
rifle for wild game?"

For trait I.—"Do you feel that the average person has made a better
adjustment to life than you have?"

"Do you feel confident that you can cope with most
situations that you will meet in the future?"

For trait N.—"Do you often become irritated over little annoyances?"

"Do you have nervous habits such as chewing your
pencil or biting your fingernails?"

(2)  Administration. — All  examinees  finish  in  approximately  45  min¬ 
utes.  Pertinent  directions  are: 

* * * Read each question in turn, think what your behavior has usually been.
Then on your answer sheet blacken the space that describes your behavior best.
Be sure to answer every question. There is no right answer to any of these questions
except the answer that tells how you think or feel about it * * *

(3) Scoring.—The authors' published keys were used. Scoring
weights had been found for each response to every item by using Guilford's
abac method (5). This procedure yielded final keys consisting of
41 items for trait G, 50 items for trait A, 52 items for trait M, 69
items for trait I, and 69 items for trait N. Only nine items are scored
for more than one trait.


Statistical results.—Experimental administration of this instrument
was completed in June 1944 at Psychological Research Unit No. 1 with
approximately 780 pilots who took primary training.

(1) Test reliability.—Reliabilities were not computed for aviation
students. The authors' estimated reliabilities secured by correlating comparable
halves of the keyed items and correcting for length by the Spearman-Brown
formula were: 0.89 for trait G, 0.88 for trait A, 0.85 for
trait M, 0.91 for trait I, and 0.89 for trait N, for a sample of college
undergraduates.

(2) Intercorrelations.—The intercorrelations of the five factors are
presented in table 23.20.

Table 23.20.—Intercorrelations of the five scores obtained from the Inventory of
Factors G A M I N, CE435A¹

Factor      A        M        I        N
G           0.50     .09      .26      .10
[Rows for factors A, M, I, and N are illegible in the source.]

¹ N_t = 782 pilots in primary training.

These  intercorrelations  are  quite  comparable  to  those  obtained  on  the 
authors’  original  civilian  data.  Inspection  of  table  23.20  shows  that  factor 
G  is  independent  of  factors  M  and  N.  Factors  A  and  I,  and  I  and  N 
show  a  fairly  high  degree  of  intercorrelation,  yet  the  correlations  are  still 
low  enough  for  each  factor  score  to  provide  differential  measurement.  The 
remaining  intercorrelations  are  moderately  low,  which  is  indicative  of 
some  success  in  the  measurement  of  independent  traits. 

(3) Test validity.—Validation data based on a sample of 782 pilots
in primary training are presented in table 23.21.

Table 23.21.—Validation data for a group of pilots in primary training, using the
graduation-elimination criterion, for the Inventory of Factors G A M I N, CE435A¹

[Table body illegible in the source.]

¹ N_t = 782, p_g = 0.76.
² Corrected to an unrestricted augmented stanine standard deviation of 2.10.

A biserial correlation of 0.10 is required for significance at the 5 percent
level and of 0.13 for significance at the 1 percent level. None of the
correlations approaches significance at the 5 percent level.

(4) Item validity.—After dividing the sample of answer sheets into
random halves, the responses to the items were correlated with the graduation-elimination
criterion from primary pilot training. The distributions
of the phi coefficients are presented in table 23.22. Only "yes" responses
were used in tallying these distributions.

Table 23.22.—Distribution of validity phis based on an item analysis of the
Inventory of Factors G A M I N, CE435A

Phi                       f (evens)¹     f (odds)²
0.18 to 0.22 ........         0              1
0.13 to 0.17 ........         2              6
0.08 to 0.12 ........        16             20
0.03 to 0.07 ........        61             42
-0.02 to 0.02 .......        85             75
-0.07 to -0.03 ......        54             46
-0.12 to -0.08 ......        11             21
-0.17 to -0.13 ......         5              3
Total ...............       210            214

¹ N = 384.
² N = 375.

In interpreting these phi coefficients, it can be said that for an N of
380, a phi of 0.10 is significant at approximately the 5 percent level and
a phi of 0.14 is significant at the 1 percent level of confidence. In the
odds sample, 30 phis (17 positive and 13 negative) exceed the 5 percent
level of significance, and 6 of these (3 positive and 3 negative) reach or
exceed the 1 percent level. For the evens sample, 15 phis (8 positive and
7 negative) reach or exceed the 5 percent level of significance, and 5 of
these (2 positive and 3 negative) reach or exceed the 1 percent level.
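The significance levels quoted for phi can be approximated with the large-sample relation chi-square = N × phi² on one degree of freedom. The sketch below assumes that relation and a two-tailed test; the function names are hypothetical, and the report does not state which table or formula was actually used. Under this approximation the 5 percent value for N = 380 comes out near 0.10, while the 1 percent value comes out near 0.13, a little below the 0.14 quoted, so the report presumably used a different table or a more conservative rounding.

```python
from math import sqrt
from statistics import NormalDist

def critical_phi(n, alpha):
    """Smallest |phi| significant at two-tailed level alpha, assuming chi-square = n * phi**2 (1 d.f.)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z / sqrt(n)

def expected_chance_hits(n_items, alpha):
    """Number of items expected to pass the level by chance alone if every true phi were zero."""
    return n_items * alpha

phi_05 = critical_phi(380, 0.05)            # about 0.10
phi_01 = critical_phi(380, 0.01)            # about 0.13
chance_05 = expected_chance_hits(214, 0.05) # 214 is the odds-sample total in table 23.22
print(round(phi_05, 3), round(phi_01, 3), round(chance_05, 1))
```

The chance expectation gives a rough yardstick for the counts of significant phis reported in the paragraph above, though it ignores the dependence among items answered by the same examinees.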

(5)  Cross-validation  data. — A  cross-validation  study  was  made,  using 
two  empirical  scoring  keys.  The  usual  criteria  of  a  split  better  than 
85-15  and  significance  at  or  beyond  the  5  percent  level  of  confidence 
were employed in constructing the scoring keys. The resulting data are
presented in table 23.23.
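The cross-validation procedure can be sketched as follows: a key is built from the item statistics of one half-sample and then used to score the other half. This is a minimal sketch under stated assumptions. The data layout (one 0/1 vector per item, 1 meaning "yes"), the function names, the reading of "a split better than 85-15" as a "yes" proportion between 0.15 and 0.85, and the tiny data set are all hypothetical; the constant 20 follows the "R-W + 20" score named in table 23.23.

```python
from math import sqrt

def phi(item, criterion):
    """Phi between a 0/1 item-response vector and a 0/1 criterion vector."""
    a = sum(1 for x, y in zip(item, criterion) if x == 1 and y == 1)
    b = sum(1 for x, y in zip(item, criterion) if x == 1 and y == 0)
    c = sum(1 for x, y in zip(item, criterion) if x == 0 and y == 1)
    d = sum(1 for x, y in zip(item, criterion) if x == 0 and y == 0)
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else 0.0

def build_key(items, criterion, min_abs_phi, max_split=0.85):
    """Return {item index: +1 or -1} for items passing the split and phi criteria."""
    key = {}
    for i, item in enumerate(items):
        p_yes = sum(item) / len(item)
        if not (1 - max_split <= p_yes <= max_split):
            continue  # response split is more lopsided than 85-15
        value = phi(item, criterion)
        if abs(value) >= min_abs_phi:
            key[i] = 1 if value > 0 else -1
    return key

def score(answers, key, constant=20):
    """Rights, wrongs, and rights - wrongs + constant for one examinee's 0/1 answers."""
    rights = sum(1 for i, sign in key.items() if sign == 1 and answers[i] == 1)
    wrongs = sum(1 for i, sign in key.items() if sign == -1 and answers[i] == 1)
    return rights, wrongs, rights - wrongs + constant

# Hypothetical half-sample: 3 items, 8 examinees, criterion 1 = graduated.
odds_items = [
    [1, 1, 1, 0, 1, 0, 0, 0],  # "yes" goes with graduation
    [0, 0, 1, 0, 1, 1, 1, 1],  # "yes" goes with elimination
    [1, 1, 1, 1, 1, 1, 1, 1],  # everyone says "yes": fails the split rule
]
odds_criterion = [1, 1, 1, 1, 0, 0, 0, 0]
key_from_odds = build_key(odds_items, odds_criterion, min_abs_phi=0.3)

# The key built on the odds half is applied to an examinee from the evens half.
print(key_from_odds, score([1, 0, 1], key_from_odds))
```

Scoring each half with the key derived from the other half keeps chance item-criterion relations in the key-building sample from inflating the validity estimate.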

Table 23.23.—Validity data based on two groups of pilots in primary training,
using the graduation-elimination criterion, for the Inventory of Factors
G A M I N, CE435A

Group     Key      Score         M_g      M_e     SD_t    r_bis   r_bis (corr.)

Odds¹     Evens    Rights² ...  16.71    17.00    2.05    -0.10     -0.10
                   Wrongs² ...   7.51     7.40    3.08     -.07      -.05
                   R-W+20 ....  49.45    49.81    4.9?     -.05      -.02
Evens³    Odds     Rights ....  13.25    12.96    3.16      .06       .03
                   Wrongs ....  25.79    26.01    5.51      .08       .0?
                   R-W+20 ....  14.15    14.54    7.68     -.01      -.04

¹ Odds N = 375, p_g = 0.7?; number of scored items = 51.

² In this table, rights means positively scored responses; wrongs means
negatively scored answers.

³ Evens sample; number of scored items = 44.

For an N of 375 or 384 a correlation of 0.14 is required for significance
at the 5 percent level and of 0.18 for significance at the 1 percent level.

Evaluation.—On the basis of the obtained validities, it appears that
the Inventory of Factors G A M I N holds practically no promise as an
instrument for predicting graduation-elimination from primary pilot
training. None of the five categories gives a biserial correlation that even
closely approximates significance at the 5 percent level. This is
interesting from one point of view, namely, the failure of any sign of validity
for the masculinity score. This result should be considered in connection
with the hypothesis regarding masculinity mentioned in chapter 25.

No collection of items based on item correlation with the criterion
seems any more promising. There would seem to be an excess number of
valid items beyond the confidence limits, but, in view of the apparently
unimodal distribution with its central tendency at zero and the failure of
the cross-validation test (see table 23.23) to show significant biserial
correlations, it is probable that there are few, if any, genuinely valid
items in this collection for the prediction of primary pilot training
success. Probably the item validities merely represent a random sampling
around zero.

It might be that this instrument would be of some use if the scores
obtained here were combined into a composite profile, together with the
scores obtained on the Inventory of Factors S T D C R and the
Guilford-Martin Personnel Inventory. The scores for the 13 factors might be
plotted on a composite graph, by means of which significant profile
configurations of traits would be revealed. Such a study would prove
interesting and possibly fruitful.


The Minnesota Multiphasic Personality Inventory, CE437A

This study was designed to determine whether the scores which can
be derived from the group form of the Minnesota Multiphasic Personality
Inventory are related to success in flying training and to predisposition
to combat fatigue. Administration for validation purposes was
undertaken at Psychological Research Unit No. 1 in July 1944.

Description. — The  Minnesota  Multiphasic  Personality  Inventory  (8) 
is  an  instrument  that  attempts,  in  one  test,  to  provide  scores  on  all  of  the 
more  important  phases  of  personality.  The  550  items  are  arbitrarily 
classified  under  26  headings  as  follows: 

1.  General  health  (9  items). 

2.  General  neurologic  (19  items). 

3.  Cranial  nerves  (11  items). 

4.  Motility  and  coordination  (6  items). 

5.  Sensibility  (5  items). 

6. Vasomotor, trophic, speech, secretory (10 items).

7. Cardio-respiratory system (5 items).

8. Gastro-intestinal system (11 items).

9. Genito-urinary system (5 items).

10.  Habits  (19  items). 

11.  Family  and  marital  (26  items). 

12.  Occupational  (18  items). 

13.  Educational  (12  items). 

14. Sexual attitudes (16 items).

15.  Religious  attitudes  (19  items). 
16. Political attitudes—law and order (46 items).

17. Social attitudes (72 items).

18. Affect, depressive (32 items).

19.  Affect,  manic  (24  items). 

20.  Obsessive  and  compulsive  states  (15  items). 

21. Delusions, hallucinations, illusions, ideas of reference (31 items).

22. Phobias (29 items).

23. Sadistic, masochistic trends (7 items).

24. Morale (33 items).

25. Items primarily related to masculinity-femininity (55 items).

26. Items to indicate whether the individual is trying to place himself
in an improbably acceptable light (15 items).

(1)  Internal  characteristics. — This  group  form  of  the  test  consists 
of  the  same  550  items  that  were  originally  developed  as  an  individually 
administered, card-sorting form. The response to each item is recorded
in one of three categories: (a) “true, or mostly true,” (b) “not usually
true, or entirely untrue,” and (c) “cannot say.” Sample items are:

I like to read newspaper editorials.

Someone has it in for me.

I  do  not  like  everyone  I  know. 

(2) Administration.—Pertinent directions are:

In this inventory you are asked for information about your feelings, your likes
and dislikes, and a great many other things. This is not a test. There are no "right"
answers except the answers that tell the truth about yourself. To a large extent,
your success in air-crew training will depend upon how well you are understood by
those in charge. It is therefore to your own interest to fill out this blank carefully
and completely.

(3) Scoring.—The authors have prepared keys for hypochondriasis,
hysteria, depression, psychasthenia, psychopathic deviation,
masculinity-femininity, paranoia, schizophrenia, and hypomania. It was decided to
use these keys only if an item-validity study indicated that these keys
had potential validity for prediction of pilot success in primary training.

Statistical results.—Data are complete for 856 pilots in primary training.

(1) Test reliability.—Reliability data were not computed. The
authors have found that the test-retest reliability coefficients of their keys
range from 0.71 to 0.83.

(2) Item validity.—The sample of pilots was split into two groups,
odds and evens, each having an N of 400, and separate item analyses
were accomplished for the two subsamples. Distributions of phis for the
odds and the evens groups, on the basis of a split of 90-10 or better, are
presented in table 23.24. Only positive phis are given, since the test
proved to be essentially a set of two-choice items for this population.
Table 23.24.—Distribution of phis based on two samples of 400 pilots in primary
training, using the graduation-elimination criterion, for the Minnesota Multiphasic
Personality Inventory, CE437A

Phi                  f (odds)    f (evens)

0.15 to 0.19 .....       2           6
0.10 to 0.14 .....      26          38
0.05 to 0.09 .....     118         122
0.00 to 0.04 .....     205         182

Total ............     351         348

For an N of 400, a phi of 0.10 is significant at approximately the 5
percent level of confidence, and a phi of 0.13 is significant at the 1
percent level. For the evens group, 44 phis reached or exceeded the 5
percent level of significance, of which 19 reached or exceeded the 1 percent
level of significance. For the odds group, 28 phis reached or exceeded
the 5 percent level of significance, of which 5 reached or exceeded the
1 percent level of significance.

Evaluation.—In view of the apparently unimodal distribution of phis
with a central tendency at zero it is probable that there are few, if any,
genuinely valid items in this collection for the prediction of primary
pilot  training  success.  These  analyses  are  taken  to  indicate  that  no  validity 
for  the  test  could  result  from  a  cross-validation  study,  and,  accordingly, 
none  was  attempted. 

It is to be remembered that this instrument was developed for use as
a clinical device in the prediction and confirmation of diagnoses of
clinical entities. It may be that it would prove valid in predicting criteria
such as combat fatigue. It was not possible to carry out this type of
validation study.

Minnesota Personality Scale, CE438A

The information cited here is concerned entirely with the form for
men of the Minnesota Personality Scale (3), which was also
administered at Psychological Research Unit No. 1 in January 1944.

Description.—Five aspects of personality are measured: Morale,
social adjustment, family relations, emotionality, and economic
conservatism. These traits were reported to have resulted from a factor analysis
of several personality tests, and they are defined as follows:

a.  High  morale  scores  are  indicative  of  belief  in  the  institutions  and 
future  possibilities  of  society.  Low  scores  usually  indicate  cynicism  or 
lack  of  hope  in  the  future. 

b.  High  social  adjustment  scores  tend  to  be  characteristic  of  the  gre¬ 
garious,  socially  mature  individual  in  his  relations  with  other  people. 
Low  scores  are  characteristic  of  the  socially  inept  or  under-socialized 
individual. 

c.  High  family  relations  scores  usually  signify  friendly  and  healthy 
parent-child  relations.  Low  scores  suggest  conflicts  or  maladjustments  in 
parent-child  relations. 
d. High emotionality scores are representative of emotionally stable
and self-possessed individuals. Low scores may result from anxiety
states or over-reactive tendencies.

e. High economic conservatism scores indicate conservative economic
attitudes. Low scores reveal a tendency toward liberal or radical points
of view on current economic and industrial problems.

(1) Internal characteristics.—The test is divided into 5 sections, and
it consists of 218 questions. Each item has five alternative responses.
Part I consists of questions 1-40 dealing with morale. Each item is to be
answered by one of the alternatives: (SA) strongly agree; (A) agree;
(U) undecided; (D) disagree; and (SD) strongly disagree. Sample
items are: (a) “Court decisions are almost always just” and (b) “There
is really no point in living.”

Part II comprises items 45-105, and deals with social adjustment. The
alternatives from which the examinee is to choose are: (AA) almost
always; (F) frequently; (O) occasionally; (R) rarely; and (AN)
almost never. Sample items are: (a) “Do you have a fairly good time at
parties?” and (b) “Are you able to recover quickly from social
blunders?”

Part III is concerned with family relations, and includes items
106-135. These items are answered with one of five alternatives of the series
(AA) almost always, through (AN) almost never. Sample items are:
(a) “Do you and your parents live in different worlds, so far as ideas
are concerned?” and (b) “Have you had to keep quiet or leave the
house to have peace at home?”

Part IV deals with emotional stability, consisting of items 142-176.
These items are answered on the same continuum of responses used in
part III. Sample items are: (a) “Are your eyes very sensitive to light?”
and (b) “Do ideas run through your head so that you cannot sleep?”

Part V consists of items 186-218 dealing with economic conservatism,
answered with alternatives ranging from (SA) strongly agree, to (SD)
strongly disagree. Sample items are: (a) “Private doctors should
encourage trends towards socialized medicine,” and (b) “The government
should take over all large industries.”

For purposes of convenience in machine scoring, there are gaps in
the numbering of items between parts I and II, III and IV, and IV
and V.

(2) Administration.—Forty-five minutes suffice for almost all
examinees to complete the test. Pertinent directions are:

The following pages contain a number of statements about which there is no
general agreement. People differ in the way they feel about the statements, and there
are no right or wrong answers. We are trying to study certain aspects of personality
that are important in your adjustment to aircrew training. You can help us by
answering each question honestly and thoughtfully. Happiness and satisfying
achievement are definitely related to your personal adjustments, therefore any effort
to study this aspect of your life is worth your cooperation.

* * * mark the one alternative which best expresses your feeling about the
statement. Whenever possible let your own personal experience determine your
answer * * *

One  practice  problem  is  given  before  beginning  the  test. 

(3) Scoring.—The inventory is machine-scored using the authors’
keys, with each response being weighted from one to five.

Statistical  results. — Results  are  available  for  338  examinees  who  took 
primary  pilot  training. 

(1) Test reliability.—Reliabilities were not computed for this group.
The authors’ corrected odd-even reliability coefficients are as follows:
Part I, 0.84; part II, 0.97; part III, 0.95; part IV, 0.94; and part
V, 0.92.
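A "corrected" odd-even coefficient is ordinarily the correlation between the odd and even half-tests stepped up to full length by the Spearman-Brown formula. The sketch below assumes that this is the correction meant here; the report does not name it. The function names are hypothetical, and the half-test correlations it implies are a back-calculation, not figures from the source.

```python
def spearman_brown(r_half):
    """Full-length reliability estimated from the correlation between two half-tests."""
    return 2 * r_half / (1 + r_half)

def half_from_full(r_full):
    """Inverse of the step-up: the half-test correlation implied by a corrected coefficient."""
    return r_full / (2 - r_full)

# Corrected coefficients quoted for parts I through V of the scale.
corrected = {"I": 0.84, "II": 0.97, "III": 0.95, "IV": 0.94, "V": 0.92}
implied_halves = {part: round(half_from_full(r), 3) for part, r in corrected.items()}
print(implied_halves)
```

The inverse function simply shows how large an uncorrected odd-even correlation would have to be to yield each quoted value under this assumption.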

(2) Intercorrelations.—The intercorrelations of the part scores are
given in table 23.25. In general, the degree of intercorrelation is
sufficiently low to insure some degree of independence of the scores.


Table 23.25.—Part score intercorrelations for the Minnesota Personality Scale,
CE438A, based on the scores of a group of 338 pilots in primary training

                                    1       2       3       4       5

1. Morale ......................   ...    0.44    0.30    0.35    0.51
2. Social adjustment ...........  0.44     ...     .41     .58     .21
3. Family relations ............   .30     .42     ...     .64     .11
4. Emotionality ................   .35     .58     .64     ...     .23
5. Economic conservatism .......   .52     .27     .11     .23     ...

(3) Test validity.—Validation data, based on a sample of pilots in
primary training, using the graduation-elimination criterion, are given
in table 23.26. For this sample a biserial coefficient of 0.15 is required
for significance at the 5 percent level and a coefficient of 0.20 for
significance at the 1 percent level.


Table 23.26.—Validation data using the graduation-elimination criterion for the
five categories of the Minnesota Personality Scale, CE438A, based on a sample
of pilots in primary training¹

Score                          M_g      M_e     SD_t    r_bis   r_bis²

Morale ....................   41.91    41.65    ?.34     0.02     0.04
Social adjustment .........   66.10    6?.?4   18.61     -.09     -.10
Family relations ..........   45.50    45.22    ?.03      .02      .04
Emotionality ..............   43.16    43.04   11.6?      .01      .00
Economic conservatism .....   15.52    15.09    4.95      .04      .07

¹ N = 331, p_g = 0.7?.

² Corrected to an unrestricted stanine standard deviation of 2.00.


Evaluation.—On the basis of the present validation study, it appears
that the Minnesota Personality Scale, CE438A, has no value for
predicting success in primary pilot training. The highest validity coefficients
obtained were 0.07 for economic conservatism and -0.10 for social
adjustment. These coefficients do not differ significantly from zero.
Shipley Personal Inventory, Format II, CE601B

The purpose of this test is to detect those individuals who exhibit
psychoneurotic or psychotic symptoms. Experimental administration of
this instrument in the Air Corps was accomplished by Psychological
Research Unit No. 1 and by headquarters of the AAF Training Command.

In  this  report,  the  results  of  three  validity  studies  will  be  mentioned. 
The  first  study  was  designed  for  the  purpose  of  validation  of  this  test 
against  the  criterion  of  success  in  primary  pilot  training.  The  second  and 
third  studies  were  conducted  for  the  purpose  of  validating  the  test 
against  the  criterion  of  satisfactory-unsatisfactory  adaptability  ratings 
for  military  aeronautics,  for  a  group  of  aviation  students  and  for  a 
group of WASP’s.

Description. — The  scored  items  have  been  divided  by  the  author  into 
20  clusters,  on  a  purely  a  priori  basis,  and  they  were  designed  as  con¬ 
venient  groupings  for  the  psychiatrist,  to  help  in  obtaining  a  qualitative 
picture  of  the  individual.  In  general  use  of  the  test,  primary  concern  is 
with the single quantitative score, and the clustering feature was
designed primarily to appeal to the psychiatrically inclined. Without
entering into a discussion of their natures, the clusters are: Psychopath A
and B, neurotic A and B, irresponsible, inadequate, social poise, sex,
sociability, hypochondriasis, near psychosis, gastro-intestinal,
epilepsy-dizzy, family stability, family not closely knit, mood swing, school
success, femininity, job-school link, and miscellaneous. These categories
were not considered in the AAF’s use of the inventory. They are
presented merely as a means of describing its content.

(1) Internal characteristics.—The inventory consists of 145 items, 60
of which are scored to yield the total number of undesirable responses.
Each item affords two choices, from which the examinee is to select the
one which seems to apply better to him. The choices are printed in two
columns, as in the following sample:

L                                          R

I take life easy. ......................   I tend to worry.
I like to listen to the radio. .........   I prefer a bang-up party.
I like to stay put. ....................   I’ve gone on the bum.

(2) Administration.—Since testing time is not specified in the
directions for the test, the first group of 280 tested was used to standardize
the time. It was found that approximately 50 percent of the group
completed half of the items in 15 minutes and that 80 percent completed the
test in 30 minutes. Accordingly, in later administrations, at the end of
15 minutes all examinees were admonished to work more rapidly. Thirty
minutes were allowed for completion of the test.

Pertinent directions are:

In this questionnaire you are to give information which will help others
understand you. You are to indicate certain things about your job preferences, interests,
etc.

In each question you will always have two answers to choose between * * *
the one on the left side of the page, and the one on the right. Choose the answer
which fits you best. Even if neither fits you very well, you must choose the one that
fits you better than the other * * *

Remember, you must always choose one answer to each question. Be sure not to
skip any questions. Work rapidly.

(3) Scoring.—The test was scored by the author's key, consisting of
60 choices which are considered undesirable. In a preliminary
administration of the inventory, the author found that an undesirable score of
18  seemed  to  be  significant  as  a  line  of  demarcation  between  the  abnor¬ 
mal  and  the  normal  group. 

Statistical  results. — The  results  of  three  validity  studies  and  other 
pertinent  statistics,  where  available,  will  be  presented. 

(1)  Test  reliability. — Test  reliabilities  were  not  computed. 

(2) Test validity.—In the first study, a sample of 1,419 pilots
originally tested at Psychological Research Unit No. 1 in October 1943
yielded a biserial correlation of 0.06, uncorrected for restriction of range,
between “undesirable” scores in this inventory and the
graduation-elimination criterion in primary pilot training. The biserial coefficient
corrected to an unrestricted stanine standard deviation of 2.00 is 0.05. The
mean score for graduates is 6.73, for eliminees 6.41, and the standard
deviation for both combined is 3.34. Of this sample 81 percent were
graduates. The obtained coefficient is barely significant at the 5 percent
level, but it is in the unexpected direction.
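The biserial coefficient can be recomputed approximately from the summary figures in this paragraph. The sketch below uses the standard biserial formula, (difference of means / total standard deviation) × pq / y, where y is the ordinate of the unit normal curve at the point dividing the proportions p and q. The function name is hypothetical. Because the published means, standard deviation, and proportion are rounded, the result (a little over 0.05) is expected to be near, not identical to, the 0.06 reported.

```python
from statistics import NormalDist

def biserial(mean_pass, mean_fail, sd_total, p_pass):
    """Biserial correlation from group means, total SD, and the proportion passing."""
    q_fail = 1 - p_pass
    z_cut = NormalDist().inv_cdf(p_pass)  # point dividing the normal curve into p and q
    y = NormalDist().pdf(z_cut)           # ordinate of the normal curve at that point
    return (mean_pass - mean_fail) / sd_total * (p_pass * q_fail) / y

# Figures quoted above: graduates 6.73, eliminees 6.41, SD 3.34, 81 percent graduates.
r_bis = biserial(6.73, 6.41, 3.34, 0.81)
print(round(r_bis, 3))
```

The positive sign reflects the point made in the text: graduates averaged slightly more "undesirable" responses than eliminees, the reverse of what the key would predict.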

The  second  study  employed  the  criterion  of  satisfactory-unsatisfactory 
adaptability  ratings  for  military  aeronautics.  A  sample  of  2,107  aviation 
students tested at Psychological Research Unit No. 1 yielded a biserial
correlation of -0.35 between the number of undesirable responses on
the test and the criterion of satisfactory-unsatisfactory adaptability
rating for military aeronautics. No distribution data are available for this
sample. For a sample of this magnitude a biserial coefficient of 0.10 is
significant at the 1 percent level (the split was 0.975-0.025).

A  comparison  was  made  of  the  mean  scores  on  the  inventory,  for  53 
aviation  students  who  received  unsatisfactory  adaptability  rating  for 
military  aeronautics  scores  and  for  510  who  received  satisfactory  scores. 
The  510  were  members  of  a  random  sample  obtained  from  a  total  group 
of  2,054  tested  at  Psychological  Research  Unit  No.  1.  The  mean  score 
for  the  sample  was  6.71,  with  a  standard  deviation  of  3.49;  the  mean 
for  the  total  group  was  6.77,  with  a  standard  deviation  of  3.49.  The 
mean  score  for  the  group  of  53  cases  was  9.49,  with  a  standard  deviation 
of  4.06.  The  critical  ratio  of  the  difference  between  means  of  the  satis¬ 
factory  and  unsatisfactory  scores  is  4.80,  which  indicates  a  very  signifi¬ 
cant  difference. 
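The critical ratio quoted above can be checked from the reported means, standard deviations, and group sizes. The sketch assumes the usual large-sample form, the difference between means divided by the standard error of the difference for independent groups. The function name is hypothetical.

```python
from math import sqrt

def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two independent means divided by its standard error."""
    se_diff = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / se_diff

# Unsatisfactory group: mean 9.49, SD 4.06, N 53.
# Satisfactory sample:  mean 6.71, SD 3.49, N 510.
cr = critical_ratio(9.49, 4.06, 53, 6.71, 3.49, 510)
print(round(cr, 2))
```

With these inputs the ratio comes to about 4.80, in agreement with the value given in the text.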

A third study pertains to scores of 194 WASP’s (classes 44-W-6 and
44-W-7) on the adaptability rating for military aeronautics, as obtained
by a psychiatrist interviewer, and scores on a revised format B. This
revision of the test was modified so as to omit items that applied only
to males and to include new items applicable to females. The ratings
were divided into four categories: (a) satisfactory, (b) borderline
satisfactory, (c) borderline unsatisfactory, and (d) unsatisfactory. The
mean undesirable score on the inventory for the entire WASP group
is 8.65.

With  the  WASP  criterion  data  combined  into  two  categories — two 
satisfactory  subgroups  in  one  and  two  unsatisfactory  subgroups  in  the 
other — the  biserial  coefficient  between  adaptability  rating  for  military 
aeronautics  and  inventory  score  is  —0.36.  This  value  is  close  to  that  ob¬ 
tained  for  aviation  students. 

(3)  Item  validity. — After  dividing  the  aviation  student  sample  of 
answer  sheets  into  two  random  halves,  the  responses  to  the  items  were 
correlated  with  the  graduation-elimination  criterion  from  primary  pilot 
training.  Since  this  test  is  a  two-choice  one,  only  the  A  responses  were 
tallied  in  the  phi  distributions.  The  distributions  of  the  phi  coefficients 
are shown in table 23.27.


Table 23.27.—Distribution of phis based on item analysis for cross-validation of
the Shipley Personal Inventory, CE601B

Phi                  f (evens)    f (odds)

0.15 to 0.19 .....       1            0
0.10 to 0.14 .....       9            4
0.05 to 0.09 .....      28           26
0.00 to 0.04 .....      51           55

Total ............      91           85

In interpreting these phi coefficients, it can be said that for an N of
636, a phi of 0.08 is significant at approximately the 5 percent level of
confidence, and a phi of 0.11 is significant at the 1 percent level. In the odds
sample, 11 phis reached or exceeded the 5 percent level of significance
with 2 of these reaching or exceeding the 1 percent level. For the evens
sample, 15 phis reached or exceeded the 5 percent level; 4 of these
reached or exceeded the 1 percent level. On inspection, it would seem
that the number of phis occurring at significant levels could have been
expected by chance.

Evaluation.—On the basis of these data, the Shipley Personal
Inventory, format II, is not useful as an instrument for the prediction of
graduation-elimination  from  primary  pilot  training.  The  biserial  coeffi¬ 
cient  of  0.06  between  the  criterion  of  graduation-elimination  from  pilot 
primary  training  and  undesirable  responses  to  the  inventory  is  barely 
significant  at  the  5  percent  level.  This  coefficient  is  in  the  opposite  direc¬ 
tion  from  that  expected,  however,  and  could  be  a  chance  deviation  from 
zero. 

The critical ratio of the difference between the unsatisfactory group
and the satisfactory group on the adaptability rating for military
aeronautics is 4.80, which indicates high significance. The biserial coefficient
between undesirable responses to the inventory and the criterion of
unsatisfactory adaptability rating for military aeronautics is 0.35, which
is far above the level of significance required at the 1 percent level, but
which is somewhat questionable because of the one-sided split of these
data. A similar correlation was obtained with a small group of WASP’s,
in which the split was well balanced.

Several  objections  that  are  of  interest  arise  in  the  use  of  this  instru¬ 
ment.  It  is  reported  that  some  of  the  items  appear  unsuitable  for  use  in 
the selection program, because they present a choice between an
acceptable and unacceptable response (e. g., “I get embarrassed easily”—“I
seldom get embarrassed”), or because both items are socially
unacceptable and of the type of “Have you stopped beating your wife?” (e. g.,
“Our family scraps often came after someone had been drinking”—
“Drinking never was the cause of our family scraps”).

It  is  reported  that  during  the  administration  of  the  test  there  was  con¬ 
siderable  laughter  and  many  comments  and  questions.  The  comments  of 
the  examinees,  in  general,  centered  about  the  fact  that  some  items  were 
silly,  that  many  seemed  to  be  repeated,  and  that  there  was  difficulty  in 
choosing  between  the  alternatives,  with  considerable  resentment  being 
shown  in  regard  to  being  forced  to  make  a  choice  in  situations  in  which 
the  examinees  denied  having  any  previous  experience. 

Restricted Word Association Test, CE702B

The purpose of this test is to predict emotional stability during and
after combat service.¹

Description. — The  idea  of  this  test  is  based  on  the  assumption  that 
words,  or  more  specifically  the  meanings  of  words,  become  associated  in 
conformity  with  the  relationships  that  exist  between  the  individual’s 
affective  attitudes  and  tendencies  to  action  and  the  situations  to  which 
such subjective factors relate. If, for instance, a given individual’s
principal tendency to action with respect to Hitler would be to attack, then
the stimulus-word “Hitler” would elicit some such response word as
anger, rage, or hate. If a given individual is excessively self-concerned,
a number of different stimulus-words should elicit responses referring
to himself. If an individual is generally negativistic, he might be
expected to respond frequently with a word which is the opposite of the
stimulus-word. Finally, if an individual is oversensitized to some given
fact,  stimulus-words  denoting  the  fact  or  connoting  characteristics  of 
the  fact  should  reveal  the  examinee’s  sensitivity. 

¹ Developed at Psychological Research Unit No. 1. Chief contributors: Lt. VlrUa E. flirt
and Capt. Donald E. Super.

(1) Internal characteristics.—The test consists of 50 items. Each
item consists of a stimulus-word followed by five alternative
response-words. Thirty of the stimulus words were selected as related to the work
of air crew. These are mixed with 20 from the Kent-Rosanoff list. The
examinee is asked to indicate the response-word in each case with which
in his thoughts and/or feelings the stimulus-word is most strongly
associated. A sample item follows:

In  my  thoughts  and/or  feelings  failure  is  most  strongly  associated  with: 

(a)  Success. 

(b) War.

(c) Myself.

(d) Flying.

(e) Fear.

(2)  Administration. — The  test  is  group-administered.  There  is  no 
time  limit,  sufficient  time  being  allowed  for  all  examinees  to  complete 
the  test.  Pertinent  directions  are: 

This is an information test to see how words are associated with other words in
cadets' thoughts and feelings. You will be given a key word and below it will be
listed five other words. You are to select the one of these five words which is most
strongly connected in your thoughts and feelings with the key word.

This is not a test of how well you understand words. There are no right or wrong
answers. Different people have different associations between words, and the
intention of this test is to get information about these differences.

The choices for each of the key words on this test have been selected in such a
way that there is no single best choice or answer for any key word. You will do best
if you do not linger over any question, but select the word which you feel, at your
first impression, seems to go most strongly with the key word.

(3) Scoring. — Test scoring is accomplished by means of an a priori
key.

Evaluation. — No statistical data are available to permit an evaluation
of this test.

PREFERENCE  INVENTORIES 

In this section are considered those tests that evaluate interests and
preferences. The general line of approach is somewhat akin to the
biographical-data approach (see ch. 27), which has proved valuable in
predicting air-crew success. Three of the four tests are commercially
provided; one was constructed in the AAF program.

The Strong Vocational Interest Blank for Men, CE503A

Experimental administration of this well-known questionnaire was
undertaken in an effort to estimate the validity of its large number of
interest  items,  in  predicting  pilot  success,  with  the  prospect  of  develop¬ 
ment  of  a  scoring  key  for  pilots. 

Description. — This commercially provided (15) interest blank measures
the degree of similarity between the expressed interests of the examinee
and the professed interests of leaders in some 38 professions and
occupations for which the test is scored. Ratings are made for such
fields as artist, psychologist, physician, policeman, social science teacher,
carpenter, advertising man, and for authors or journalists. Two
non-occupational ratings are made; one for masculinity-femininity and the
other for interest-maturity.

(1) Internal characteristics. — The blank consists of 400 items, each
with three alternative responses. There are eight sections, which are
listed in order:

Part I. Occupations. — In this section the examinee indicates whether
or not he would like each of some 100 different occupations, responding
with "like," "indifferent," or "dislike." Sample occupations are:
astronomer, governor of a State, and machinist.

Part II. School subjects. — The examinee indicates, by means of the
same ratings used in section I, his interest in various school subjects.
Sample subjects are: bookkeeping, military drill, and typewriting.

Part  III.  Amusements. — Using  the  same  ratings  as  in  the  previous 
sections,  the  examinee  records  his  first  impression  to  various  types  of 
amusements,  including  sports,  reading  material,  and  places  to  visit.  Sam¬ 
ples  are:  roughhouse  initiations,  symphony  concerts,  and  Atlantic 
Monthly. 

Part IV. Activities. — In this section, the examinee again employs the
ratings of "like," "indifferent," or "dislike," to indicate his interest in
activities which range from the sedentary to the extremely active.
Samples are: saving money, arguments, and pursuing bandits in sheriff's posse.

Part V. Peculiarities of people. — In this section, the examinee is
instructed to record his first impression of various types of people, using
the same ratings as in the preceding sections. Samples are: spendthrifts,
people who talk about themselves, and people who don't believe in
evolution.

Part VI. Order of preference of activities. — This section consists of
4 groups of 10 items each. The examinee indicates which 3 of the group
of 10 items he likes most, and which 3 he dislikes most. The remaining
four activities are checked "indifferent." Samples from 1 group of 10
items are: Operate (manipulate) the new machine; discover an
improvement in the design of the machine; and determine the cost of
operation of the machine.

Part VII. Comparison of interest between two items. — Pairs of items
are given to which the examinee responds by checking one or the other
of the pair of items, or if his preference is equal, he may indicate that
choice. Samples are: Chauffeur or chef; Do a job yourself or delegate
job to another; and Deal with things or deal with people.

Part VIII. Rating of present abilities and characteristics. — This
section is divided into two parts. In the first, the examinee is to indicate
by "yes," "?," or "no" whether or not each of a group of statements
applies to him. Samples are: Usually get other people to do what I want
done; Able to meet emergencies quickly and effectively; and Show
firmness without being easy.

In the second part of this section, the examinee is to indicate which
one of each group of three statements applies best to him. Samples are:

(1) Tell jokes well. (2) Seldom tell jokes. (3) Practically never tell jokes.

(1) Worry considerably about mistakes. (2) Worry very little. (3) Do not worry.

(2) Administration. — The test was administered without time limit,
between 45 and 60 minutes being required for all examinees to finish.

(3) Scoring. — The author's keys were not used. It was planned,
instead, to develop an empirical key for pilots. For purposes of
cross-validation, answer sheets were split into two groups. Separate item
analyses were accomplished for the two subsamples, and two scoring
keys were devised. The criteria for scoring a response were: (a) a phi
significant at or beyond the 5 percent level (0.11) and (b) a split of
85-15 or better. The evens group was scored with the odds key, and the
odds group was scored with the evens key.
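
The two-criterion keying and the swap of keys between halves can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure actually used at the research unit: responses are simplified to chosen or not chosen, a keyed response is weighted +1 or -1 by the sign of its phi, the function names (phi, build_key, score) are hypothetical, and the data in the demonstration are synthetic.

```python
import math
import random


def phi(a, b, c, d):
    """Phi coefficient for a 2x2 table of counts.

    a: chose the response and graduated   b: chose it and was eliminated
    c: did not choose it and graduated    d: did not choose it and was eliminated
    """
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else 0.0


def build_key(responses, graduated, min_phi=0.11, max_split=0.85):
    """Return {item index: +1 or -1} for responses meeting both criteria."""
    n = len(responses)
    key = {}
    for j in range(len(responses[0])):
        a = sum(1 for r, g in zip(responses, graduated) if r[j] and g)
        b = sum(1 for r, g in zip(responses, graduated) if r[j] and not g)
        c = sum(1 for r, g in zip(responses, graduated) if not r[j] and g)
        d = n - a - b - c
        chosen = (a + b) / n
        if chosen > max_split or chosen < 1 - max_split:
            continue  # split more extreme than 85-15
        value = phi(a, b, c, d)
        if abs(value) >= min_phi:
            key[j] = 1 if value > 0 else -1
    return key


def score(response, key):
    """Rights-minus-wrongs score: keyed responses chosen, weighted +1 or -1."""
    return sum(weight for j, weight in key.items() if response[j])


if __name__ == "__main__":
    rng = random.Random(0)  # synthetic data only; no real answer sheets
    graduated = [rng.random() < 0.7 for _ in range(644)]
    responses = [[rng.random() < 0.5 for _ in range(40)] for _ in graduated]
    odds = range(1, 644, 2)
    evens = range(0, 644, 2)
    odds_key = build_key([responses[i] for i in odds], [graduated[i] for i in odds])
    evens_key = build_key([responses[i] for i in evens], [graduated[i] for i in evens])
    # Each half is scored with the key derived from the other half.
    evens_scores = [score(responses[i], odds_key) for i in evens]
    odds_scores = [score(responses[i], evens_key) for i in odds]
    print(len(odds_key), len(evens_key), len(evens_scores), len(odds_scores))
```

Because the synthetic responses are unrelated to the criterion, any items that enter a key here do so by chance, which is the situation cross-validation is meant to expose.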

Statistical results. — Data are available for a group of approximately
650 pilots who took primary training, who were originally tested in June
1944 at Psychological Research Unit No. 1.

(1) Item validity. — After dividing the sample of answer sheets into
two random halves, the responses to the items were correlated with the
graduation-elimination criterion from primary pilot training. The
distributions of the phi coefficients are shown in table 23.28. The phis are
based on "yes" or "like" responses only.


Table 23.28. — Item validation data for groups of pilots in primary training for the
Strong Vocational Interest Blank, CE503A

Phi                   f (evens)¹
0.18 to 0.22            4
0.13 to 0.17            )t
0.08 to 0.12            72
0.03 to 0.07            ns
-0.02 to 0.02           84
-0.07 to -0.03          49
-0.12 to -0.08          11
-0.17 to -0.13
Total                   )M

¹ N = 322. ² N = 323.


In interpreting these phi coefficients, it can be said that for an N of
322, a phi of 0.11 is significant at approximately the 5 percent level of
confidence, and a phi of 0.15 is significant at the 1 percent level of
confidence. In the odds sample, 37 phis (25 positive and 12 negative) reached
or exceeded the 5 percent level of significance, and 9 of these (8
positive and 1 negative) reached or exceeded the 1 percent level of
significance. In the evens sample, 32 phis (10 positive and 22 negative)
reached or exceeded the 5 percent level of significance, with 4 of these
(4 negative) reaching or exceeding the 1 percent level of significance.
There would seem to be an excess number of phis beyond the confidence
limits, but, in view of the apparently unimodal distribution with its
central tendency at zero, it is probable that there are few, if any, genuinely
valid items in this collection for the prediction of primary pilot training
success.
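
The significance levels quoted above can be checked roughly with the large-sample relation chi-square (1 df) = N × phi². The sketch below assumes that relation and a two-tailed test; the report does not say how its own values were obtained, and the function names are illustrative. Under these assumptions the 5 percent value for N = 322 rounds to 0.11, in agreement with the text, while the 1 percent value comes out near 0.14, slightly below the 0.15 quoted.

```python
import math
from statistics import NormalDist


def critical_phi(n, alpha):
    """Smallest |phi| significant at two-tailed level alpha for sample size n.

    Uses chi-square (1 df) = n * phi**2, so |phi| = z / sqrt(n).
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z / math.sqrt(n)


def expected_chance_hits(n_items, alpha):
    """Number of items expected to reach significance by chance alone."""
    return n_items * alpha


print(round(critical_phi(322, 0.05), 2))   # 5 percent level
print(round(critical_phi(322, 0.01), 2))   # 1 percent level
print(expected_chance_hits(400, 0.05))     # one phi per item, 400 items
```

With one phi per item and 400 items, about 20 coefficients would be expected to pass the 5 percent level by chance, against the 37 and 32 reported for the two halves.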

(2) Cross-validation data. — Cross validation, nevertheless, was
accomplished. Cross-validation data are presented in table 23.29. Since only
one of the keys shows a validity that approximates the 5 percent level
of significance (and then in the reverse direction), it is concluded that
the scoring of this test with these empirical keys has no value in
predicting pilot success in primary training. With the exception of this one
key, the remaining biserial coefficients do not come close to significance.


Table 23.29. — Cross-validation for a group of pilots in primary training, using a
graduation-elimination criterion, for the Strong Vocational Interest Blank, CE503A

Group     Key      Score     M_g      M_e      SD_t     r_bis    r_bis (corrected)¹
Odds²     Evens    Right     25.21    25.57    6.50     -0.01    -0.01
                   Wrong     19.69    19.61    4.13       .01      .01
                   R-W        5.52     5.96    10.1*     -.01     -.01
Evens³    Odds     Right     17.35    17.40    1.9*      -.01     -.04
                   Wrong⁴    12.04    11.71    3.29      MS        .20
                   R-W        5.31     6.19    5.90      -.10     -.11

¹ Corrected to an unrestricted stanine standard deviation of 2.00.
² N = 322, p_g = 0.70, number of scored items = 85.
³ N = 323, p_g = 0.70, number of scored items = 56.
⁴ Significant at the 5 percent level.

Evaluation. — On  the  basis  of  the  failure  of  the  cross-validation  re¬ 
sults,  it  is  concluded  that  this  test  cannot  be  used  to  predict  success  in 
primary  pilot  training. 

It  should  be  remembered  that  the  attempt  to  derive  an  empirical  key 
for  this  test  was  not  similar  to  that  customarily  followed  by  Strong.  His 
procedure  would  have  compared  experienced  pilots’  responses  with  those 
of  other  combined  occupational  groups.  It  would  seem,  however,  that 
the  procedure  used  in  this  study  was  more  direct  and  should  have  yielded 
positive results if the items of the Blank are potentially discriminating
for  pilot  selection. 

Maller-Glaser Interest Values Inventory, CE514A

This inventory was administered in an attempt to determine the validity
of personal values scores in predicting success in flying training.

Description. — Allport and Vernon (1), following Spranger's
classification in Types of Men (14), constructed a test to determine the relative
prominence of six basic motives or evaluative attitudes that govern men's
actions. These are: theoretical, economic, aesthetic, political, social, and
religious. The Maller-Glaser simplification of the Allport-Vernon test is
aimed at measuring four types of personal values: (a) social, (b) economic,
(c) aesthetic, and (d) theoretical.

(1) Internal characteristics. — The inventory consists of 34 items,
each with four alternative responses. It is divided into three parts.

Part I consists of items 1 through 10. Each item consists of four
words. The examinee's task is to select the one word from among each
group of four that pleases him most. A sample item is:

(a) Money.

(b) Research.

(c) Welfare.

(d) Masterpiece.

Part II includes items 11 through 20. It is concerned with word
associations, in which the examinee is presented with a key word and four
alternative responses. The task is to select the alternative which seems
most closely associated with the key word. A sample is:

Civilization:

(a) Justice.

(b) Order.

(c) Refinement.

(d) Reason.

Part III includes items 21 through 34. The section is concerned with
interests, in which the examinee indicates which of the four alternatives
has most appeal for him. A sample item is:

If you were employed by a large automobile manufacturing concern and you had
the necessary ability, which of the following positions would you prefer?

(a) Handle the labor relations work.

(b) Do research work on the development of a better automobile engine.

(c) Direct a new market system for selling the car.

(d) Work at improving the appearance of the automobile.

(2) Administration. — Sufficient time was allowed so that all could
complete the test. Pertinent directions are:

This is a test of interests and values. Investigations have shown that men vary in
the choices which they make in this test, and that these differences affect success in
various types of activities. There are no right or wrong answers; simply indicate the
one answer which appeals most to you.

(3)  Scoring. — The  authors’  key  was  used  in  scoring.  The  test  is  scored 
for  the  four  values  of  social  (S),  economic  (E),  aesthetic  (A),  and 
theoretical  (T).  One  alternative  for  each  item  is  scored  for  each  value. 
Thus, there is an S, E, A, and T alternative to each item.

Statistical  results. — Results  are  available  for  524  pilots  in  primary 
training,  originally  tested  in  January  1944  at  Psychological  Research 
Unit  No.  1. 

(1)  Test  reliability. — Reliabilities  were  not  computed. 

(2)  Test  validity. — Validation  results  are  presented  in  table  23.30. 


Table 23.30. — Validation data for pilots in primary training using the
graduation-elimination criterion, for the Maller-Glaser Interest Values Inventory, CE514A¹

Score          M_g      M_e      SD_t    r_bis    r_bis (corrected)²
Social         6.64     6.80     3.06    -0.03    -0.09
Economic      13.27    12.11     4.55     .15³      .03
Aesthetic      4.39     4.94     3.34    -.09     -.06
Theoretical    9.64    10.08     4.11    -.06      .08

¹ N = 524, p_g = 0.80.
² Corrected to an unrestricted stanine standard deviation of 2.00.
³ Significant at the 5 percent level.

The rather large difference between the corrected and uncorrected
biserial coefficients is due to the severe restriction of range in the stanine.
In this sample, the standard deviation of the stanine was only 0.07.
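
As a check on how the uncorrected coefficients relate to the tabled means, the usual biserial formula r_bis = (M_g − M_e) / SD_t × (pq / y) can be applied to table 23.30. The sketch assumes that the first mean is that of the graduates, that the SD is the total-group standard deviation, and that 0.80 is the proportion graduating; the eliminee mean for the economic score is read as 12.11. The function name is illustrative. The correction for restriction of range is not attempted, since it needs correlations with the stanine that are not given here.

```python
from statistics import NormalDist


def biserial_from_means(mean_grad, mean_elim, sd_total, p_grad):
    """Biserial r from group means: (M_g - M_e) / SD_t * (p * q / y)."""
    q = 1.0 - p_grad
    z = NormalDist().inv_cdf(p_grad)   # normal deviate dividing the group at p
    y = NormalDist().pdf(z)            # ordinate of the normal curve at z
    return (mean_grad - mean_elim) / sd_total * (p_grad * q / y)


rows = {  # values as read from table 23.30
    "Social": (6.64, 6.80, 3.06),
    "Economic": (13.27, 12.11, 4.55),
    "Aesthetic": (4.39, 4.94, 3.34),
    "Theoretical": (9.64, 10.08, 4.11),
}
for name, (m_g, m_e, sd) in rows.items():
    print(name, round(biserial_from_means(m_g, m_e, sd, 0.80), 2))
```

Under these assumptions the computed values round to -0.03, 0.15, -0.09, and -0.06, matching the uncorrected column.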

(3) Intercorrelations. — The intercorrelations among the four
categories are presented in table 23.31. The universal negative correlations
are spurious and are due to the fact that the selection of one alternative
automatically means rejection of others.

Table 23.31. — Part-score intercorrelations for the Maller-Glaser Interest Values
Inventory, CE514A, based on the scores of 524 pilots in primary training

Score     S        E        A        T
S         ...     -0.23    -0.27    -0.25
E        -0.23     ...     -.40     -.61
A        -.27     -.40      ...     -.15
T        -.25     -.61     -.15      ...
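
The spurious negative correlations produced by forced-choice scoring can be illustrated with a small simulation. It assumes 34 items with four alternatives, one per value, and examinees who choose entirely at random with equal probability; these are simplifying assumptions, not a model of the actual examinees, and the function names are illustrative. Even with no real relation among the values, the scores correlate negatively, near -1/3 in this equal-probability case.

```python
import math
import random


def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_scores(n_people=2000, n_items=34, n_values=4, seed=0):
    """Each simulated examinee picks one of n_values alternatives per item at random."""
    rng = random.Random(seed)
    people = []
    for _ in range(n_people):
        counts = [0] * n_values
        for _ in range(n_items):
            counts[rng.randrange(n_values)] += 1
        people.append(counts)
    return people


people = simulate_scores()
social = [p[0] for p in people]
economic = [p[1] for p in people]
r_random = pearson(social, economic)
print(round(r_random, 2))
```

The tabled coefficients differ from one another because real preferences are neither random nor equally popular, but the negative sign throughout is what this scoring method would produce in any case.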

Evaluation. — Inspection of the biserial coefficients for the
Maller-Glaser Inventory reveals that only one category, economic, had a
significant validity coefficient in predicting graduation-elimination from
primary pilot training. The exact nature of this relationship is dubious
since the corrections reduced this value nearly to zero. The three
remaining coefficients were low.

This inventory does have certain advantages over certain other tests
of  its  type.  It  is  short,  the  type  of  items  is  diversified,  and  it  has  con¬ 
siderable  interest  for  the  examinee.  It  affords  more  freedom  of  choice, 
in  that  it  provides  four  alternatives,  than  inventories  of  the  yes-and-no 
answer  variety. 

The structure of the key may be open to question, in that each of the
alternatives is scored for one of the four categories. Examinees frequently
have conflicts between two equally strong choices. It might be
well  to  call  for  two  responses  to  each  item.  This  would  add  to  the  infor¬ 
mation  that  can  be  extracted  from  the  same  items  as  well  as  minimize 
conflicts  and  reduce  negative  intercorrelations  of  scores. 

Kuder Preference Record, CE515A

This  preference  blank  was  administered  experimentally  at  Psychologi¬ 
cal  Research  Unit  No.  1  in  May  1944  in  an  attempt  to  validate  its  nine 
interest  scores,  several  of  which  are  not  measured  in  the  other  personal¬ 
ity  inventories. 

Description. — This test is intended for use primarily in the vocational
and educational guidance of high-school and college students and in
employee counselling (11). Measurements are made in nine general areas,
which are listed below, together with some of the diverse activities which
the author states are included in each area:

I. Mechanical. — Civil engineer, surgeon, industrial designer,
fireman, and stonemason.

II. Computational. — Accountant, credit manager, bond salesman,
purchasing agent, and traffic clerk.

III. Scientific. — Archeologist, oculist, agronomist, weather observer,
radio operator, and science teacher.

IV. Persuasive. — Advertising manager, clergyman, psychiatrist,
lawyer, receptionist, and retail merchant.

V. Artistic. — Art critic, furniture designer, illustrator,
cabinetmaker, milliner, and photoengraver.

VI. Literary. — Copywriter, writer, advertising writer, reporter,
English teacher, press agent.

VII. Musical. — Composer, accompanist, choir director, music teacher,
singer, and sound engineer.

VIII. Social Service. — Camp director, psychologist, occupational
therapist, sales manager, and policeman.

IX. Clerical. — Bookkeeper, billing machine operator, cost
accountant, and court reporter.

(1)  Internal  characteristics. — This  preference  record  consists  of  168 
items.  Each  item  comprises  a  group  of  three  activities,  from  which  the 
examinee  is  to  indicate  the  activity  he  likes  the  most,  and  that  which  he 
likes  least,  as  in  each  of  the  following  sample  items: 

Sample I

Visit an art gallery.

Browse in a library.

Visit a museum.

Sample II

Collect autographs.

Collect coins.

Collect butterflies.

(2) Administration. — The test is administered without time limit.
Approximately 30 minutes are required for the majority of examinees to
complete the test. Answers are recorded on the author's specially
prepared answer sheets, suitable for machine scoring.

The directions include specific comment on the manner in which the
answer sheet is to be accomplished and several sample problems.
Pertinent directions are:

This  blank  is  used  for  obtaining  a  systematic  record  of  your  preferences  so  that  a 
picture  can  be  obtained  of  how  your  preferences  compare  with  those  of  others  who 
have answered the questions. The blank is not a test of ability. There are no right or
wrong answers. An answer is right only if it is a true expression of your preference
* * * Some of the activities named in the following pages involve a certain
amount  of  preparation  and  training.  In  such  cases,  make  your  choice  on  the  assump¬ 
tion  that  you  could  first  have  the  training  and  experience  necessary  for  all  the 
activities.  Do  not  choose  an  activity  merely  because  it  is  new  or  unusual.  Make  your 
choices  on  the  basis  of  what  you  would  like  to  do  as  a  regular  thing  if  you  were 
equally  familiar  with  all  the  activities. 

In  some  cases  you  may  find  that  you  like  all  three  activities  in  a  group;  in  other 
cases  you  may  find  all  three  unpleasant.  Please  make  your  choices  for  every  group 
even  though  the  decisions  may  be  hard  to  make. 


(3)  Scoring. — The  scoring  of  the  preference  record  is  accomplished 
by  means  of  the  author’s  keys. 

Statistical results. — Results are available for a group of 937 pilots
who  took  primary  training. 

(1)  Test  reliability. — Reliability  data  are  not  available  for  this  sam¬ 
ple.  The  author  of  the  record  reports  test-retest  reliabilities,  on  a  group 
of 41 graduate students, of 0.97, 0.95, 0.95, 0.96, 0.95, 0.95, 0.93, and
0.98,  respectively,  for  the  nine  areas  in  the  order  given  above. 

(2)  Test  validity. — Validity  data  are  presented  in  table  23.32  for  a 
group  of  pilots  in  primary  training. 


Table 23.32. — Validation data for pilots in primary training, using the
graduation-elimination criterion, for the Kuder Preference Record, CE515A¹

Score             M_g      M_e      SD_t    r_bis    r_bis (corrected)²
Mechanical        sens     85.48    15.57    0.02     0.11
Computational     33.12    33.39     9.35    -.02     -.62
Scientific        67.32    67.88    12.58    -.02      M
Persuasive        68.33    68.47    16.54    -.01     -.00
Artistic          49.71    47.84    13.25     .08      .14
Literary          46.43    46.30    13.29     .01     -.01
Musical           19.13    18.40     8.96     .05      .00
Social service    63.13    65.67    14.27    -.10³    -.01
Clerical          46.48    46.25    12.10     .01     -.9*

¹ N = 937, p_g = 0.7/.
² Corrected to an unrestricted stanine standard deviation of 2.00.
³ Significant at the 5 percent level.

It is noteworthy that only one of the biserial coefficients, that for
the social-science score, is significant at the 5 percent level. The artistic
score, which was expected to be low and perhaps negative on the bases
of biographical-data and sports-and-hobbies test results, approached
significance at the 5 percent level. Surprisingly, the mechanical score, which
might be expected to have a high degree of validity for predicting pilot
success, had little validity here.

(3) Intercorrelations. — The intercorrelations among the various
interest scores are presented in table 23.33.

Table 23.33. — Intercorrelations among part scores for the Kuder Preference
Record, CE515A, for a sample of 937 classified pilots

Interest                I      II     III    IV     V      VI     VII    VIII   IX
I. Mechanical           ...    0.01   0.24  -0.27   0.20  -0.38  -0.26  -0.01  -.07
II. Computational       0.01   ...    .34   -.11   -.22   -.04   -.07   -.14    .51
III. Scientific         .24    .34    ...   -.31   -.04    .17   -.23   -.04   -.05
IV. Persuasive         -.27   -.11   -.31    ...   -.21    .15    .01    .01    .11
V. Artistic             .20   -.22   -.04   -.21    ...   -.14   -.04   -.23   -.19
VI. Literary           -.38   -.06    .17    .15   -.14    ...    .14   -.21   -.05
VII. Musical           -.24   -.07   -.23    .01   -.04    .14    ...   -.14    .61
VIII. Social science   -.01   -.14   -.06    .01   -.23   -.21   -.14    ...   -.25
IX. Clerical           -.07    .51   -.05    .11   -.19   -.05    .61   -.25    ...

Upon inspection it may be seen that there is little or no degree of
concomitant variation between the interest measures, with the exception
of that between clerical and computational scores, which seems to be due
to an overlapping in the type of questions in each area. The relationship
between artistic and mechanical scores is just the reverse of what is
expected on the basis of biographical-data results (see ch. 27). It is
possible that the negative coefficients are spuriously high, due to the nature of
the pairing of the various items throughout the test.

Evaluation. — On the basis of the obtained biserial coefficients, it is
found  that  the  Kuder  Preference  Record  has  no  value  in  predicting  suc¬ 
cess  in  primary  pilot  training.  Three  of  the  nine  areas  measured  by  the 
author’s  keys  yield  biserial  coefficients  different  from  those  expected  on 
the  basis  of  other  analyses.  Thus  the  artistic  and  musical  areas,  which 
in  other  studies  have  yielded  negative  validity  coefficients,  in  this  case 
have  yielded  positive  ones.  The  mechanical  area,  which  almost  without 
exception  has  had  considerable  validity  for  predicting  pilot  success,  in 
this  case  was  of  negligible  value.  The  chief  explanation  of  this  variation 
probably  lies  in  the  fact  that  these  questions  sample  appreciation  and 
interest  in  an  area  rather  than  experience,  which  is  sampled  by  valid 
mechanical  tests.  The  commonality  between  mechanical  interest,  as  meas¬ 
ured  by  the  record,  and  mechanical  experience  must,  therefore,  be  very 
low. The evidence can be taken to mean that the mechanical-experience
factor  is  properly  named;  at  least,  that  it  is  not  an  interest  factor. 

Teacher Preference Scale, CE426A¹

This  test  was  developed  for  the  purpose  of  assessing  several  hypothe¬ 
sized  personality  characteristics.  It  was  expected  that  the  examinee 
would  reveal  his  personality  by  indicating  the  type  of  teacher  he  prefers. 
In using a teacher-preference paired-activity scale, it was hoped that the
underlying  principle  of  the  test  would  be  hidden  sufficiently  so  that  stu¬ 
dents  would  not  be  able  to  detect  its  true  purpose. 

Description. — The hypothesized traits for which the scale is scored
are:

(a) Excessive demand for definiteness of structure (SD): Individuals
vary in the degree to which they require definiteness in the training
situation. Some may tolerate more unknown elements than others. The
student with a low tolerance for the uncertain reacts to the training
situation with hesitancy, has difficulty in building and maintaining
confidence, and is hypersensitive to change. His behavior is characterized
by confusion, tenseness, blocking, and anxieties.

(b) Decision difficulties (DD): In some cases the failure of a
student lies in his reactions to situations that require him to arrive at
decisions quickly and appropriately. In such situations, alternatives must be
kept in mind, weighted, integrated, and a practical decision reached. The
individual with decision difficulties may know the various factors
involved, yet not arrive at decisions quickly and effectively. The net result
usually is undesirable mechanical flying.

(c) Ego-sensitivity (ES): This category refers to the individual who
possesses strong, unsatisfied ego needs that are sufficiently dominant to
interfere with his progress and performance. Included here are such
traits as hypersensitivity to criticism, extreme desire for independence,
and self-consciousness. The attention to self tends to leave less attention
for other things pertaining to flying, makes the person hesitant to secure
information, and results in tension, hesitancy, and worry about failure to too
great an extent.

(d) Social-sensitivity (SS): Somewhat related to ego-sensitivity, this
category of social-sensitivity includes such traits as over-attentiveness
to the instructor as a person and over-dependence upon the instructor.

(e) Normal (N): A normal key includes those responses not keyed
adversely for any of the other traits.

¹ Developed at Psychological Research Unit No. 3. Chief contributor: Lt. Jacob S. Kounin.

(1) Internal characteristics. — The scale is divided into two parts. In
part I the examinee indicates the type of teacher that he would like to
have as an instructor. Pairs of characteristics of teachers are presented,
and the examinee chooses the preferred characteristic in each item. In
part II the examinee indicates the type of teacher that he would not like
to have as an instructor, again by choosing between pairs of
characteristics. Sample items follow:

Sample items, part I: (Choose type of teacher most preferred)

1. A. The teacher who is a "square shooter" and a "regular fellow."

B. The teacher who is able to gather and judge facts and arrive at clear
conclusions.

2. A. The teacher who pays special attention to the slow or maladjusted pupil.

B. The teacher who makes work interesting by using examples and
illustrative material.

Sample items, part II: (Choose type of teacher most disliked)

1. A. The teacher who is dishonest.

B. The teacher who is changeable and inconsistent.

2. A. The teacher who always hesitates and never seems to make up his mind.

B. The teacher who stands for a lot of foolishness and waste of time.

(2) Administration. — The test is group-administered, 20 minutes
being allowed for each part of the scale. There are 80 items in each part.
At the end of 10 minutes, the examinees are told that they should be
half-way through the part. The time limit was set so that at least 80
percent of the group would complete the test.

(3) Scoring. — The two parts are scored separately according to the
five a priori categories: Normal, ego sensitivity, social sensitivity,
structure demands, and decision difficulty.

Statistical results. — Data on internal consistency and score
intercorrelations are available for pilots in primary training originally tested at
Psychological Research Unit No. 3, with validation data available for
part I only.

(1) Test validity. — Validity data are available for a group of pilots
in primary training, using the graduation-elimination criterion. These
results are presented in table 23.34.

Table 23.34. — Validation data for a group of pilots in primary training, using the
graduation-elimination criterion, for part I of the Teacher Preference Scale, CE426A 1

Score      Mg       Me       SDt      rbis       rbis (corrected) 2
N         35.61    36.19    4.45     -0.08      -0.12
ES         9.02     8.91    2.04       .05        .04
SS        11.81    12.35    2.27      -.14 3     -.16
SD        10.74    10.66    2.12       .02        .04
DD         9.49     9.71    1.87      -.07       -.[illegible]

1 Nt = 422. pg = 0.80.

2 Corrected to an unrestricted stanine standard deviation of 2.00.

3 Significant at the 5 percent level.


Only  the  negative  relationship  between  the  social  sensitivity  score  and 
success  in  primary  pilot  training  is  significant  at  the  5  percent  level. 
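
As a worked check on the social sensitivity coefficient, the sketch below recomputes a biserial correlation from the summary figures in table 23.34. It rests on assumptions that the report does not state here: that the tabled coefficient is the conventional biserial r, computed as (Mg - Me) / SDt multiplied by pq / y, where p is the proportion graduating, q = 1 - p, and y is the ordinate of the unit normal curve at the point dividing p from q; and that the first two columns are the means of graduates and eliminees respectively. The function name is illustrative.

```python
from statistics import NormalDist


def biserial_r(mean_grad, mean_elim, sd_total, p_grad):
    """Conventional biserial r from group means, total SD, and pass rate."""
    q_elim = 1.0 - p_grad
    unit_normal = NormalDist()  # mean 0, standard deviation 1
    z_cut = unit_normal.inv_cdf(p_grad)   # point dividing p from q
    ordinate = unit_normal.pdf(z_cut)     # height of the curve at that point
    return (mean_grad - mean_elim) / sd_total * (p_grad * q_elim / ordinate)


# Social sensitivity (SS) row of table 23.34, with pg = 0.80.
r_ss = biserial_r(mean_grad=11.81, mean_elim=12.35, sd_total=2.27, p_grad=0.80)
```

Under these assumptions the SS row gives about -.14, in agreement with the uncorrected tabled value. The sign is negative because the graduates' mean is the lower of the two.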

(2) Intercorrelations. — Score intercorrelations were computed for
part I, part II, and between corresponding scores in parts I and II. The
correlations between "normal" scores and others are, of course, spurious,
because of the scoring method. These results are presented in
tables 23.35, 23.36, and 23.37. On the basis of the intercorrelations between
the two parts, a low degree of communality is indicated. Thus, it
would have been advisable to validate each part of the test rather than
to base conclusions on part I as symptomatic of both parts.
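
The report does not spell out why the scoring method makes the correlations with the "normal" score spurious. One plausible reading, offered here only as an assumption, is that every answered item credits exactly one of the five categories, so that the five scores sum to a constant. In that case the covariances of any one score with the other four must add up to the negative of its own variance, which forces negative correlations regardless of content. The sketch below demonstrates the identity on simulated data; the item count, sample size, and category weights are all illustrative and are not taken from the study.

```python
import random


def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


random.seed(0)
CATEGORIES = ("N", "ES", "SS", "SD", "DD")
N_ITEMS, N_PEOPLE = 80, 200  # illustrative sizes only

# Each simulated examinee answers every item; each answer credits one category.
people = []
for _ in range(N_PEOPLE):
    picks = random.choices(CATEGORIES, weights=(9, 2, 3, 3, 2), k=N_ITEMS)
    people.append({cat: picks.count(cat) for cat in CATEGORIES})

columns = {cat: [person[cat] for person in people] for cat in CATEGORIES}

var_n = covariance(columns["N"], columns["N"])
cov_n_with_others = sum(
    covariance(columns["N"], columns[cat]) for cat in CATEGORIES if cat != "N"
)
# Because the five scores always total N_ITEMS, cov_n_with_others == -var_n
# up to floating-point error.
```

The same constraint applies to every category, but it bears most heavily on the score with the largest variance, which in tables 23.35 and 23.36 is the "normal" score.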


Table 23.35. — Intercorrelations of scores for part I of the Teacher Preference
Scale, CE426A 1

Key       N        ES       SS       SD       DD
N         ...     -0.32    -0.25    -0.32    -0.27
ES       -0.32     ...       .11      .03      .12
SS        -.25      .11     ...       .09      .13
SD        -.32      .03      .09     ...       .14
DD        -.27      .12      .13      .14     ...

1 Nt = 422 pilots in primary training, originally tested from September 13 to November 2[illegible], 1944.


Table 23.36. — Intercorrelations of scores for part II of the Teacher Preference
Scale, CE426A 1

Key       N        ES       SS       SD       DD
N         ...     -0.75    -0.59    -0.68    -0.66
ES       -0.75     ...       .45      .34      .28
SS        -.59      .45     ...       .19      .15
SD        -.68      .34      .19     ...       .50
DD        -.66      .28      .15      .50     ...

1 N = [illegible] pilots in primary training in class 44H, originally tested in November 1943.


Table 23.37. — Intercorrelations of scores for part I with those of part II for
Teacher Preference Scale, CE426A 1

Key       r [heading illegible]     r [heading illegible]
N           0.39                      0.56
ES          [illegible]                .47
SS           .15                       .41
SD           .27                       .42
DD           .35                       .52

1 [N illegible] in primary training in class 44H.

Evaluation. — Inspection of the data for the Teacher Preference Scale,
part I, indicates that only one category, social sensitivity, has any promise
of validity. The coefficient is negative in sign, indicating that students
with low social sensitivity scores have a slight advantage in graduation
from primary training. The relatively low intercorrelations between the
scores on part I and those on part II indicate unacceptably low reliabilities.
They may indicate a functional difference between the two ways of
phrasing the questions.

Two criticisms have been made of the rationale of this test. First, it
seems likely that, if this test were administered to examinees in any particular
phase of training, they might have certain instructor or teacher
stereotypes which would interfere with the validities of their answers
for prediction in some other stage of training. Secondly, it seems questionable
whether an examinee’s "ego needs" can be projected clearly,
simply by stating his preferences for teachers.

Another objection lies in the fact that the examinee is forced to choose
between two alternatives, without any recourse to modified decisions.
Thus, lack of a third alternative, such as "don’t know" or "haven’t
experienced this," might well lower the discrimination required of the
examinee. Group-test administrators report that, having only two possibilities
from which to choose, the examinee tends to adopt a somewhat
superficial attitude toward his choices. Apparently, a few questions which
force the examinee to choose between what, for him, are situations beyond
the scope of his information or experience, or which do not fit
him, will adversely color the nature of his interest and effort toward
the other items in the test.

This  type  of  scale,  if  carefully  developed,  might  well  serve  as  a  selec¬ 
tive  device  in  indicating  the  type  of  instructor  (ground  school  or  flying) 
needed  by  each  student.  Likewise,  a  complementing  selective  device  ap¬ 
plied  to  instructors  might  well  yield  information  which  could  result  in  a 
matching  of  student  and  instructor,  which  would  yield  an  optimal 
teacher-student  relationship.  The  test  apparently  has  little  promise  as 
a  pilot-selection  instrument. 

SUMMARY AND EVALUATION

This chapter has demonstrated that personal inventories and preference
inventories generally fail in the two primary objectives that were
set for assessment. First, with two, or possibly three, exceptions, none
was able successfully to predict graduation from primary pilot training at
significant levels of confidence. Second, item-validation studies generally
failed to yield many items with statistically significant validities. It may
be that with a higher standard of item selection (at the 1 percent level
of confidence, or at the 5 percent level in both odds and evens samples)
better success would have been attained. Even with this highly restricting
standard, a sufficiently large number of items could be pooled from
several inventories to make further study with them profitable.

A very useful purpose has been achieved, however, in the general failure
of these tests to predict air-crew training success. It serves as confirmation
of the belief that temperament tests must be constructed
specifically for the job intended. It is obvious from the data obtained
that almost all these tests, not constructed for the task at hand, fail to
validate successfully against a training criterion.
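
The stricter item-selection standards mentioned earlier in this summary can be compared by simple arithmetic. The sketch below estimates how many items would be expected to pass each standard by chance alone. It assumes that no item has any true validity, that items are tested independently, and that the odds and evens samples are independent of each other; the pool of 200 items is a hypothetical figure, not one taken from the study.

```python
def expected_chance_items(n_items, alpha, n_independent_samples=1):
    """Expected count of items passing by chance at level `alpha`
    in every one of `n_independent_samples` independent samples."""
    return n_items * alpha ** n_independent_samples


POOL = 200  # hypothetical number of items pooled from several inventories

at_5_percent = expected_chance_items(POOL, 0.05)            # about 10 items
at_1_percent = expected_chance_items(POOL, 0.01)            # about 2 items
at_5_in_both_halves = expected_chance_items(POOL, 0.05, 2)  # about 0.5 item
```

On these assumptions, requiring the 5 percent level in both halves is the more restrictive of the two standards. Requiring in addition that the relationship have the same sign in both halves would halve the chance expectation again.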

In the discussion of the failure of this type of test to validate successfully
against a training criterion, it must be remembered that they may
be of value in predicting success or failure in combat in terms of combat
neuroses. The chief difficulty here is in obtaining actual combat validation
data.

It is possible, also, that even in training, the graduation-elimination
criterion or some other job-proficiency criterion is not a suitable one
for the validation of temperament tests. Many a person with poor temperamental
traits may by extra effort and under external pressure show
satisfactory job proficiency. The criterion might better be in the areas
of trainee satisfactions and adjustments and of instructor and supervisor
satisfactions with him as a person with whom they must deal. Such criteria
might yield quite different validities than those found for tests
described in this chapter.

Criticisms of the tests have been made both from the standpoint of the
examinee and from that of the administrator. The length of the majority
of the tests is too great, even if moderate value were received. Examinees
have complained of actual boredom by the time they have completed
some of the longer inventories; some have insisted that there is actual
repetition of items.

The structuring of the items is another source of difficulty. In some
cases, fear, pride, or shame may cause an examinee to falsify his answers.
Knowledge (or in some cases assumed knowledge, which is often
faulty) as to what is desired in the classification schema colors many
responses, at times causing the examinees to attempt to outguess or outwit
the purpose of the inventory. The use of inventories that provide
only "yes" and "no" responses, without recourse to a third category of
indecision or "?," has been stated by examinees to be frustrating and
to affect their reactions to the remaining items in the inventory.

Two  suggestions  have  been  made  for  further  study: 

(1) Tests such as the Teacher Preference Scale, CE426A, if carefully
developed, might well serve as a selective device in indicating the
type of instructor needed by each student. A complementing technique
applied to instructors might yield information which could result in a
matching of student and instructor which would afford an optimal
teacher-student relationship.

(2) A more comprehensive measuring device than that of the individual
inventories would be afforded by combining the scores from several
complementary inventories, such as the Guilford-Martin Personnel
Inventory, the Inventory of Factors STDCR, and the GAMIN Inventory.
In this instance, the scores for the 13 factors might be plotted
on a composite graph, by means of which significant profile configurations
of traits would be revealed. Validation of such configurations would
prove an interesting and possibly fruitful investigation.

CHAPTER TWENTY-FOUR

Clinical Type Procedures 1


INTRODUCTION 

The introduction of clinical types of procedures in the air-crew classification
program constitutes a sharp departure from standard written
tests  both  in  approach  and  in  techniques.  For  this  reason  it  is  appropriate 
to  state  briefly,  by  way  of  preamble  to  the  procedures  themselves,  the 
general  rationale  underlying  this  use. 

Fairly  specific  aims  were  set  forth  in  the  project  proposal  at  Psycho¬ 
logical  Research  Unit  No.  1  for  experimental  application  of  clinical  pro¬ 
cedures  to  air-crew  classification :  (1)  To  determine  the  prognostic  effi¬ 
ciency  of  various  clinical  procedures,  separately  and  in  combination; 
(2)  to  discover  and  interpret  interrelationships  among  clinical  tech¬ 
niques  and  other  measurements;  (3)  to  provide  leads  for  the  development 
of  tests  that  could  be  empirically  validated;  and  (4)  to  provide  case 
studies  of  the  personal  characteristics  of  examinees  from  which  could 
be built up eventually a man analysis (in terms of personality characteristics)
of performance in pilot training and, if possible, in combat.

The  clinical-type  procedures  attempt  to  emphasize  the  interaction 
among  personality  traits  within  the  individual.  Consideration  of  the  role 
of  compensating  or  balancing  factors  is  fundamental  in  these  procedures 
in evaluating any single characteristic of the individual. Because a global
approach  is  basic  to  these  techniques,  they  afford  the  possibility  of  in¬ 
vestigating  complex  interrelationships  of  personality  that  are  difficult  to 
ascertain  through  the  use  of  other  methods. 

Many of the tests to be described in this chapter are of the type that
in ordinary clinical practice are individually administered, such as the
Rorschach Psychodiagnostik and the Thematic Apperception Test. In
an attempt to adjust them to meet a large testing load, however, either
the scoring or administration, or both, were in some trials adapted to a
large-scale method of handling. Such devices as a special scoring system
for the individually administered Rorschach, and a group-administration
method for the Rorschach and for the Thematic Apperception Test,
represent attempts to streamline individual clinical-type procedures to fit a
large-scale testing program.

1 Written by S/Sgt. Arthur Z. Cert.

The Clinical Techniques Project

A clinical-procedures group 2 that sought to explore the possibilities
of utilizing a holistic approach to the subject of classification for air-crew
positions was established at Psychological Research Unit No. 1.
At one time the objective of this group was to establish an entirely separate
stanine, based on the results of clinical procedures, which was to be
validated against the criterion of graduation-elimination from primary
pilot training. This goal was never realized.

The complete battery established for the clinical-techniques project
cuts across the lines of several chapters in this volume. The tests are:

(1) Projective techniques:

(a) The Rorschach Test, CE701A.

(b) Thematic Apperception Test, CE706A.

(2) Observational techniques:

(a) Observational Stress Technique, CE710A.

(b) Observation During Psychomotor Testing Rest Period, CE709A.

(c) The Interaction Test, CE425A.

(d) Observation of Atypical Behavior During Psychomotor Testing, CE708A.

(e) Conference on Occupational Background and Interpretation of Test Score, CE707A.

(3) Printed tests:

(a) The Behavior Preference Questionnaire, CE432A.

(b) The Personal Audit, CE431A.

(c) Occupational Experience Blank, CE603A.

(4) Self-rating techniques:

(a) Indices of Self-Confidence, CE427A.

Plan  of  Approach 

The content of this chapter is in two sections. The first is concerned
with  projective  techniques  within  the  accepted  definition  of  the  term,  of 
which  probably  the  best  known  are  the  Rorschach  and  the  Thematic  Ap¬ 
perception  tests.  These  projective  instruments  range  from  tests  that  are 
individually administered and scored through others, such as the Empathetic
Response test, that are group-administered and machine-scored.

In  the  second  section,  observational  techniques  are  presented.  These 
techniques  involve  the  observation  and  rating  of  the  behavior  of  exami¬ 
nees under stress situations either in the performance of actual apparatus
tests, such as the observation of atypical behavior during psychomotor
tests,  or  in  tests  specifically  designed  for  observational  procedures  such 
as  the  Observational  Stress  test. 

2 Members: Lt. Avrum H. Ben-Avi, Sgt. Gerald S. Blum, Lt. Mason Haire, Lt. John S.
Harding, Lt. George S. Klein, Lt. John W. [name illegible], Sgt. Harold M. Proshansky, Lt. John W.
Rothney, Sgt. Leo Srole, Sgt. Bernard Steinzor, Capt. Donald E. Super, S/Sgt. John L.
Wallen.

Except where noted to the contrary, the data that follow are based
upon unclassified aviation students tested at Psychological Research
Unit No. 1 during the period May 27 to June 31, 1943. Those who entered
pilot training were in classes 44C, 44D, and 44E.

PROJECTIVE  METHODS 

Projective  methods  are  intended  principally  as  probing  instruments, 
going  beyond  the  conscious  or  superficial  responses  of  individuals  in 
determining  personality  content  and  structure.  In  this,  they  promise  types 
of  information  regarding  goals,  values,  conflicts,  anxieties,  and  emotional 
complexes  beyond  that  obtainable  by  the  usual  inventory  procedures.  The 
promise,  further,  of  a  total-personality  picture  in  which  is  revealed  not 
simply  isolated  traits  and  attitudes  but  their  interrelationships  in  the 
structure of personality is one not afforded by other approaches.

In  method  and  stimulus  material,  the  projective  techniques  depart 
from  the  questionnaire  type  of  personality  test.  The  method  is  founded, 
essentially,  on  the  assumption  that  the  individual  reveals  his  ways  of 
organizing  experiences  when  he  is  presented  with  stimuli  that  are  rela¬ 
tively  poorly  structured  as  to  meaning  and  that  he  himself  organizes 
through  the  projection  of  meanings,  ideas,  and  feelings.  The  projective 
methods  thus  allow  for  greater  freedom  and  spontaneity  in  response  than 
do standard ratings and questionnaires; responses are less biased by
prearrangement of test items, as in psychometric tests, and do not depend
on  self  diagnosis  or  introspective  ability. 

The Rorschach Test, CE701A

It was thought that further information on the aviation trainee’s intellectual,
emotional, and motivational characteristics, and the patterning
of these traits, beyond that revealed by printed tests, would contribute
significantly to the prediction of air-crew success. Accordingly,
experimentation with a standardized administration of the Rorschach
technique (4) was undertaken.*

Description. — The  Rorschach  method  is  essentially  a  procedure  for 
revealing  the  personality  of  the  individual  as  an  individual,  as  contrasted 
with  rating  or  assessing  him  in  terms  of  his  likeness  or  conformity  to 
social norms of action and speech. It is just because a subject is not
aware  of  what  he  is  telling  and  has  no  cultural  norms  behind  which  to 
hide  himself  that  the  Rorschach  and  other  projective  methods  are  so 
revealing  (1). 

In contrast to this approach to personality diagnosis, personality
inventories and questionnaires attempt to establish more rigidly controlled
and standardized situations. In the questionnaire method, however, the
examiner is deprived of the possibility of understanding how and why
the examinee arrives at a particular result. There is a second important
disadvantage in the use of questionnaires, in that the examiner must rely
almost entirely on information which the subject is willing and able to
furnish. The Rorschach method seeks to overcome these difficulties.

* Administration and interpretation was executed chiefly by the following individuals, most of
whom are members of the Rorschach Institute: Cpl. James A. Christenson, Pfc. Herman Feifel,
Sgt. Harold M. Proshansky, Walter J. [name illegible], Sgt. Bernard Steinzor, 2nd Lt. Herbert J.
Zucker.

(1) Internal characteristics. — In the typical application of the task,
the examinee is presented with a standard series of 10 ink-blot pictures,
reproduced on cards 7 inches by 9 1/2 inches (8). In five of the blots,
colored ink is used in addition to black ink. The examinee is asked
merely to report what he sees in the blots. The blots themselves, while
symmetrical, have little structure. It is assumed that the examinee, by
the very ambiguity of the blots, must himself organize them in order to
give them some form or meaning. In the main part of the test the examinee
gives his free responses to each blot, the examiner offering no
further encouragement after his initial instructions, merely recording
verbatim the examinee’s responses.

The  inquiry  follows  the  main  part  of  the  test.  After  the  examinee  has 
given  his  responses  to  all  of  the  10  cards,  the  examiner  returns  to  each 
card,  reads  the  responses,  and  inquires  into  the  way  in  which  the  re¬ 
sponses  were  formed;  whether,  for  example,  it  was  shading  or  color, 
and  whether  the  whole  or  a  part  of  the  blot  determined  the  response. 
From  an  analysis  of  the  frequencies  of  responses  within  the  various  cate¬ 
gories,  and  through  qualitative  considerations  of  the  types  of  responses, 
a picture of the personality of an individual is constructed from the projected
material, including both intellectual and nonintellectual aspects.

(2) Administration. — In the large-scale administration of the test, the
gathering of responses to the ink-blot cards, the scoring, and the interpretation
of the records were, in a sense, independent steps. The same
examiners did not, in all cases, participate in all three phases on the
same records. The steps in the processing of a record were:

(a) Recording of responses to the cards. Some preliminary scoring was
done at this time.

(b) The examiner’s first clinical impression, formed entirely from the
behavior of the examinee in the test situation. Solely on the
basis of this over-all impression, the examiner made a clinical
prediction of success in primary pilot training (CE701A-I).
The prime consideration underlying these predictions was the
examiner’s knowledge of the results of job analyses for pilots.

(c) Initial scoring of the record by the examiner.

(d) Checking of the scoring.

(e) Tabulation and summary of the scores.

(f) Final interpretation of the record.

(g) Clinical prediction of success or failure in primary pilot school
(CE701A-II), based on scores and interpretation of the record.

One examiner tested one examinee at a time. Each examiner tested
approximately four aviation students during each testing day. Most examiners
were members of the Rorschach Institute. A rigorous course in
interpretation was conducted for all examiners.

A  few  minutes  were  spent  during  the  first  part  of  the  testing  period 
in informal conversation with the examinee to put him at ease. Then the
test  was  begun  with  these  instructions : 

This is a test of visualization. It is somewhat different from others you have
taken, and you’ll probably find it a more interesting experience. It consists of 10
cards made up out of ink blots. You know you can drop ink on a sheet of paper,
fold it, smear it over, and when you open it, find a "blotto." Different people see
these cards in different ways. We would like to know what you see in these cards.
There are no right or wrong answers. You can say as little or as much as you want
about each card. Time does not matter in this test. When you’ve said all you want
about a card, place it face down on the table, and I will give you the next one. Here
is the first card * * * What might this be?

Upon  completion  of  the  10  cards,  the  examiner  began  the  inquiry  with 
these  instructions : 

That completes the first part of the test. Here are the cards. In order to score
your  responses  adequately,  it  is  important  for  me  to  be  sure  how  and  why  as  well  as 
where  you  saw  the  particular  things  you  did,  so  I  can  see  them  exactly  as  you  did. 

There  were  no  time  limits.  Average  testing  time  ranged  between  75 
and  120  minutes. 

(3)  Scoring. — Two  general  scoring  treatments  were  accorded  the 
data: 

(a)  CE701A-I. — Clinical  predictions  of  success  in  primary  pilot  train¬ 
ing  were  based  on  the  over-all  clinical  impression.  This  rating  was  made 
in  most  cases  immediately  after  the  administration  of  the  test.  Ratings  of 
the  examinees'  self-confidence,  on  a  five-point  scale,  were  also  made  at 
this  time. 

(b) CE701A-II. — Clinical predictions of success in primary pilot
training based on interpretations of the Rorschach responses. These ratings
were made after the records had been scored and interpreted.

It is now in order to list the interpretive significance of each of the
various scoring categories or signs for air-crew training. This interpretative
schema reflects as much as possible the opinions held in common by
the examiners concerning the requirements of air-crew duties. If a certain
characteristic of a category score is regarded as a positive or a negative
indicator, it should be clear that this evaluation is made in relation
to the prospect of success in pilot training; it does not represent a value
judgment of an examinee. It is obvious, also, that these positive and
negative weightings do not follow necessarily from the Rorschach interpretive
procedure. A brief interpretation of the categories as agreed
upon by the clinical procedures group follows:

1. Inner life. — a. M. — The absence or presence of M (human movement)
was given no special significance except in cases where many
M’s might compensate for unfavorable features of the record. The presence
of some human movement, two or more M’s, was of positive interpretive
value as a compensating factor.

b. FM. — The lack of FM (animal movement) was viewed as a negative
indicator (i. e., associated with failure). A well-developed FM column,
considered in conjunction with the characteristics of the content
of the responses themselves, numerous responses, and an adequate number
of whole responses (W’s), indicates in the Rorschach schema an
adequate fund of energy. These characteristics were taken to mean in
the present study that the person had generally desirable drive and, specifically,
a real desire to fly in the air-crew situation.

c.  m. — If  the  number  of  m  (inanimate  movement)  responses  out¬ 
weighed both M and FM, it was considered a poor sign, since it indicated
intrapersonal conflict. A few m’s, coupled with the presence of
anxiety  and  an  inability  to  make  an  adequate  adjustment,  indicated  for 
the  interpreters  that  the  individual  was  a  very  poor  prospect  for  pilot 
training. 

2. Outer life. — a. Sum C. — The absence or marked suppression of sum
C (color) responses was given considerable negative weight. The CF
response (determined by both color and form, but primarily color)
seemed to be more characteristic of the group than a preponderance of FC,
and this reaction was presumed to be a favorable sign, even if it was in
the FC column. If the dominant color responses were explosive, such
as volcano or fire, the examinee revealed poor self control and was considered
a poor prospect. The presence of FC responses alone was regarded
as slightly unfavorable, since it indicated a too careful individual
who was not very spontaneous. Pure C responses, which were very few
in number, had to be considered in conjunction with other factors before
any significance could be attached to them. In general, the gross reactivity
to color as expressed in sum C was regarded as the more important
factor. If the individual evidenced a basic extroverted pattern (as revealed
by the fact that he offered many responses to the cards containing
color) but did not utilize the color itself in his responses, the absence
of color responses was regarded as especially serious.

3. Control. — Poor control was assumed to be a negative element in
prediction when F percent (the percentage of responses which are determined
exclusively or primarily by form) was below 20. This could
be compensated for by indications of control in other responses, such as
good form perception in M, FM, and color areas. If the student had
more than 60 percent F, he was considered to be constricted. In terms
of flying, this meant presumptively that he was unable to shift his attention
with enough rapidity and that he lacked flexibility in his approach
to situations. This constriction was regarded as especially serious if it
resulted in a repression of the inner life (in the case of an individual
who was characteristically an introvert) or in a repression of the outer
life (in the case of a basically extroverted person).

4. Sensitivity. — Fc, c. — The use of texture (c) in responding was
considered a positive sign. It was taken to indicate a degree of sensitivity
and tact which would stand the student in good stead in his relations
with instructors and his fellows. It also indicated a certain plasticity and
an affinity with objects. Thus, an individual possessing these characteristics
might be able to feel his way through a maneuver and fly by the
seat of his pants. The use of Fc (response concept of definite form and
texture) indicated a more elaborate way of indicating control. An important
qualification must be stated, however. If Fc plus c outweighed
the F column or sum C in an extroverted personality, this syndrome
was negatively weighted. In general, also it was felt, as in the case of F,
that too high a frequency was as unfavorable as too low a frequency.
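
The two-sided rule referred to here, as stated for F percent under Control, can be sketched in a few lines. The cut-off values of 20 and 60 percent are those given in the text; the function names and the labels returned are illustrative only. The sketch deliberately ignores the compensating indications of control in M, FM, and color responses, which the examiners weighed clinically rather than by formula.

```python
def f_percent(n_form_responses, n_total_responses):
    """Percentage of responses determined exclusively or primarily by form."""
    if n_total_responses <= 0:
        raise ValueError("a record must contain at least one response")
    return 100.0 * n_form_responses / n_total_responses


def control_sign(f_pct, low=20.0, high=60.0):
    """Classify F percent against the cut-offs stated in the text."""
    if f_pct < low:
        return "poor control (negative sign)"
    if f_pct > high:
        return "constricted (negative sign)"
    return "within acceptable band"


# Three hypothetical records of 20 responses each.
signs = [control_sign(f_percent(n_form, 20)) for n_form in (3, 8, 14)]
```

A record falling in either negative band would still require inspection of the other determinants before a prediction was made.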

5. C'. — Only a small proportion of cadets utilized black, white, or
grey as surface color. This was, therefore, not regarded as a very important
sign except when the C' responses were more frequent than the
color responses, thereby indicating a tendency to avoid full responsibility
or involvement with social and emotional experiences.

6. Anxiety. — k (toned-down shading effects), K (use of diffusion),
FK. — These determinants did not appear frequently. Usually the existence
of anxiety had to be derived from other aspects of the record.
But if there were many k and K responses, the diagnosis of anxiety was
more certain. If the chiaroscuro responses outweighed the development
of the inner life, M and FM, the picture was regarded as unfavorable.
Presence of FK, indicating an introspectiveness and tendency toward
self-analysis, somewhat diminished the seriousness of signs of anxiety.
If FK outweighed F, however, the examinee was regarded as too self-conscious
for purposes of successful performance in the training situation.

7. Mental approach. — W, D, d, Dd + S. — The normal percentage in the
use of location (as defined by Klopfer and Kelley [4]) was taken as an
important sign for success in air-crew training. A well-balanced and
elastic mental approach was believed to be one of the most desirable
assets of the trainee. If the individual had an unusually high W percent
(percentage of whole responses), this indicated too great a
preoccupation with a phantasy life or a consuming intellectual ambition, and these
were taken as unfavorable signs. Too few W's indicated an inability to
integrate and organize a task or situation. A low W percent usually
implied a high D percent (percentage of responses utilizing large usual
details). This indicated too great an emphasis on everyday facts and
details. In general, however, this was not assigned great importance, and
the absence of D was considered significant only when there was no
further attempt at analysis within the whole response. Dd (unusual detail)
+ S (space), and d (small usual detail) were also assigned a negative
weight when they far exceeded the norms defined by Klopfer and Kelley.

8. Erlebnistyp ratios. — (a) M : Sum C; (b) (FM + m) : (Fc + c + C');
(c) (8 + 9 + 10) percent. These three ratios were mainly considered for
their consistency. It was assumed that an extroverted individual was more
suited for pilot training than an introvert. However, the
introvert-extravert tendency was not regarded as being as important as whether the
individual was living out his basic tendencies. In other words, the question was
whether he was "being himself." Both an introversive tendency with no
M's or few M's and an extroverted picture with no color reactions were
negative indicators. The general assumption, then, was that an individual
will not function at his optimum efficiency if he lives against his basic
tendencies.

9. A percent (percentage of animal responses). — This was not
regarded as a very important factor. Only if A percent exceeded 60
percent and was regarded together with other indicators of mental stereotypy
(such as a large number of popular responses and little variety of
response) was it allowed some negative significance.

10.  P—O  ( popular  minus  original  responses). — This  was  practically 
never  used  as  a  datum  for  a  prediction. 

11. Frequency of responses and rejection of cards. — The number of
responses  never  received  more  weight  than  the  quality  of  responses.  It 
was  soon  observed,  however,  that  numerous  examinees  gave  only  10  re¬ 
sponses.  This  posed  a  rather  difficult  problem  for  interpretation.  Not 
only  were  the  tabulated  statistics  rendered  less  reliable,  but  the  quantity 
of  material  in  the  record  was  often  very  meager.  Predictions  in  these 
cases  were  consequently  made  without  much  confidence,  and  the  examin¬ 
er’s  clinical  impression  was  relied  upon  more  heavily  than  usual.  In  gen¬ 
eral,  it  was  felt  that  the  more  responses  a  person  gave,  the  better  were 
his  chances  of  passing  training.  If  the  individual  gave  20  or  more  re¬ 
sponses  (R's),  however,  and  yet  revealed  constriction  or  repression  of 
his  functions  to  an  unusual  degree,  his  chances  of  success  were  consid¬ 
ered  to  be  decreased.  There  were  few  individuals  who  rejected  cards 
(failed to give any response at all). Those who could not find an answer
to  one  or  more  cards  were  considered  as  poor  training  prospects. 

12.  Time. — The  time  taken  by  the  examinee  to  respond  to  a  card  was 
very  limited  as  a  source  of  interpretation.  Only  when  a  student  was  very 
slow in responding, or when his reaction time to any particular card
indicated color or shading shock, did the time contribute to the interpreter's
judgment.

Statistical results. (1) Examiner differences. — The influence of the
examiner upon the number of responses was studied. The means and
standard deviations of numbers of responses are presented by examiner
in table 24.1. The critical ratios for differences among examiners are
given in table 24.2. Twelve differences are significant at the 1 percent
level of confidence and three more at the 5 percent level, a total number
that would appear to be well above the expectation on the assumption of
homogeneity of examiners.
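
The comparison between two examiners can be illustrated with a short sketch. It assumes that the critical ratio is the difference between two independent means divided by the standard error of that difference; the source does not print the formula, and the function name is illustrative only. The inputs are the figures for examiners II and VII as they appear in table 24.1.

```python
import math


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two independent means over its standard error.

    Assumed formula (not printed in the source):
        CR = (M_a - M_b) / sqrt(SD_a^2 / N_a + SD_b^2 / N_b)
    """
    se_diff = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / se_diff


# Examiners II and VII, figures as read from table 24.1.
cr = critical_ratio(17.5, 9.1, 49, 14.6, 6.5, 44)
print(round(cr, 2))
```

Under the normal approximation, a ratio of about 1.96 corresponds to the 5 percent level and about 2.58 to the 1 percent level.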


Table 24.1. — Distribution statistics on number of responses obtained by nine
examiners on the Rorschach Test, CE701A

Examiner No.      N       M      SD
I                75    22.2    15.4
II               49    17.5     9.1
III              67    21.5    12.1
IV               66    15.?     7.2
V                72    19.9    10.4
VI               22    24.?    16.0
VII              44    14.6     6.5
VIII             26    22.7    10.9
IX               66    20.5    14.4

Table 24.2. — Critical ratios between means of responses for nine examiners
on the Rorschach Test, CE701A

Examiner No.       II     III      IV       V      VI     VII    VIII      IX
I                2.54*   0.74    4.11*   1.57    0.21    4.22*   0.21    1.09
II                       1.98*   1.45    1.25    2.15*   1.66    1.09    1.22
III                              2.72*    .8?     .86    2.84*    .42     .44
IV                                       2.15*   2.11*    .24    2.22*   2.72*
V                                                1.44    2.20*   1.12     .20
VI                                                       2.22*    .45    1.15
VII                                                              2.24*   1.76
VIII                                                                     2.87*

* Significant at the 5 percent level or beyond; twelve of the marked ratios
reach the 1 percent level.


(2)  Temporal  differences. — The  influence  of  time  of  examination 
upon  numbers  of  responses  was  also  studied.  The  results  are  presented 
in  table  24.3. 


Table 24.3. — Mean numbers of responses obtained at different times of the day
for the Rorschach Test, CE701A*

           Morning    Midday    Afternoon
Mean          18.1      19.2         20.5
SD            12.6      10.9            ?

* Number of cases = 497.


(3)  Reliability. — Test  reliability  was  not  estimated. 

(4) Validity of clinical impressions. — Validity results are available
against the criterion of graduation or elimination from primary training.

In table 24.4 are shown the validity data for the over-all clinical
impressions and predictions, made immediately after the administration of
the test (CE701A-I).

Table 24.4. — Validation data for clinical predictions based on Over-all Clinical
Impressions (CE701A-I), based upon the Rorschach Test, CE701A-I, for
groups of pilots in primary training, class 44C, using the
graduation-elimination criterion

N, 

r».. 

292 

0.09 

0.12 

190 

41 

.06 

Validation of clinical predictions of success in primary pilot training,
based upon interpretations of the scored Rorschach records
(CE701A-II), however, was more promising. For a group of 281 pilots
in primary training, 92 percent of whom graduated, the biserial
coefficient of correlation was 0.23, which is significant beyond the 5 percent
level of confidence. The corrected biserial was 0.26. Because of the
marked differences between examiners, however, it is felt that these
results are not definitive.

(5) Validity of single categories. — Individual scoring categories, 25
in number, were validated for two groups of pilots in primary training
in order to evaluate some of the noninterpretive or direct measures
yielded by the Rorschach technique. Each record was scored with respect
to the location, determinant, and content of responses, according to the
system described by Klopfer and Kelley (4). The data are presented in
table 24.5. For group I, a biserial correlation of 0.19 is required for
significance at the 5 percent level and of 0.25 at the 1 percent level. For
group II, the required coefficients are 0.21 and 0.27. The biserial
correlations are seen to be generally low, and those that seemed significant in
group I proved to be of doubtful significance in the revaluation of the
categories in group II.
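
A biserial coefficient of this kind can be sketched as follows. The sketch uses the textbook formula, r_bis = (M_g - M_e) / SD_t x (pq / y), and assumes that the columns of table 24.5 give the mean of graduates, the mean of eliminees, and the standard deviation of the total group; the source does not spell this out, and the function and argument names are illustrative. The inputs are the figures for category R in group I (18.49, 15.75, and 10.94, with 85 percent graduating).

```python
from statistics import NormalDist


def biserial_r(mean_grad, mean_elim, sd_total, p_grad):
    """Biserial correlation between a score and a pass/fail criterion.

    r_bis = (M_g - M_e) / SD_t * (p * q / y), where y is the ordinate of
    the unit normal curve at the point dividing its area into p and q.
    """
    q_elim = 1.0 - p_grad
    std_normal = NormalDist()          # mean 0, standard deviation 1
    z_cut = std_normal.inv_cdf(p_grad)
    y = std_normal.pdf(z_cut)
    return (mean_grad - mean_elim) / sd_total * (p_grad * q_elim / y)


# Category R, group I, figures as read from table 24.5.
r_bis = biserial_r(18.49, 15.75, 10.94, 0.85)
print(round(r_bis, 2))
```

With these inputs the result rounds to 0.14, the value tabled for R in group I, which lends some support to the reading of the columns assumed here.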


Table 24.5. — Validation data for 25 single Rorschach categories for the Rorschach
Test, CE701A, graduation-elimination criterion for two samples of pilots in
primary training

                          M_g             M_e             SD_t           r_bis
Category                   I      II       I      II       I      II       I      II
R                      18.49   20.90   15.75   19.15   10.94   12.45    0.14    0.08
T                     718.00  758.00  670.00  700.00  450.00  446.00     .08     .07
T/R                    44.77   41.00   40.11   44.00   22.81   20.26     .10    -.01
(illegible)            71.12   22.00   20.12   21.00   11.55    17.0     .01    -.02
T(?)                   27.21   28.00   24.69   24.00   ??.06    22.0     .08     .14
W                       9.17    8.80    7.11    8.65    4.09   17.15     .24     .02
W percent              ?0.16   51.26   55.75   55.1?   27.85   27.94     .08    -.04
D                       7.06    8.18    6.72    6.71    6.70    6.95     .01     .11
D percent              11.67   14.29   17.61    15.0   20.47   20.06    -.15    -.01
Dd + S percent            10    9.19     6.0    8.89    7.99   10.10     .11     .01
M                       1.43    1.75     .91    1.51    1.78    1.77     .15     .08
FM                      1.84    1.89    1.01    4.05    2.72    2.78     .15    -.01
F                       6.69    7.42    5.81    6.71    5.91    7.16     .08     .06
F percent              14.10   17.70   15.95   11.45   17.04   16.65    -.05     .04
FK + F + Fc percent    19.61   41.76   40.11   41.59   17.17   16.20    -.02     .08
Fc                      1.41    1.65    1.09    1.70    1.71    1.95     .10    -.01
Fc + c (?)              1.79    2.20    1.50    2.01    1.85    2.15     .08     .04
FK + K + k (?)          1.03    1.42    1.28    1.16    1.14    1.68    -.08       ?
Sum C                   2.16    2.20    2.06    1.84    1.50    1.96     .04     .11
FC                      1.40    1.67    1.11    1.46    1.55    1.48     .09     .08
CF                      1.51    1.56    1.41    1.16    1.46    1.58     .05     .14
8, 9, 10 percent       16.00   ?1.7?   17.94   15.97    9.05    6.98    -.08    -.07
A percent              40.71   46.6?   48.21   48.68   16.17   15.65     .05    -.07
Variety (?)             6.51    4.97    6.28    6.51    2.57    2.47     .05     .11
P                       1.74    1.81    1.75    4.16    1.67    1.65     .00    -.12

Group I: N_t = 290; p_g = 0.85. In class 44C (?).
Group II: N_t = 191; p_g = 0.80. In class 44C.


(6) Validity of weighted composite scores. — An attempt was made to
formulate a more readily employable procedure of evaluating Rorschach
records. Certain categories were combined so that a composite score
could be obtained. Each category was weighted on the basis of what it
would contribute to the classification battery and also on the basis of
its variability. The final formula, expressed in standard Rorschach
symbols, is:

Composite score = 2 (Dd + S percent) + 6 (FM) + 8 (W) - 1.5 (D
percent) + 1 (R) - 1 (8, 9, 10 percent).

Data for a new group of pilots in primary training (N_t = 156, p_g = 0.79;
in classes 44D and 44F) against a graduation-elimination criterion gave
a biserial coefficient of 0.04. From this, it is concluded that the Rorschach
scores used in this manner show no promise as a predictive instrument.
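
The composite can be sketched as a plain weighted sum. The weights below are taken from the formula as it reads in the text; the sign of the first term and the weight for W are hard to make out in the source, so both should be treated as assumptions. The dictionary keys and the example record are hypothetical and serve only to show the arithmetic.

```python
# Weights as read from the formula in the text; keys are illustrative names.
COMPOSITE_WEIGHTS = {
    "dd_s_percent": 2.0,            # assumed positive (sign unclear in source)
    "fm": 6.0,
    "w": 8.0,                       # assumed to be 8 (digit unclear in source)
    "d_percent": -1.5,
    "r": 1.0,
    "cards_8_9_10_percent": -1.0,
}


def composite_score(record):
    """Weighted sum of six Rorschach category scores for one examinee."""
    return sum(weight * record[name]
               for name, weight in COMPOSITE_WEIGHTS.items())


# Hypothetical record, invented purely for illustration.
example = {
    "dd_s_percent": 10.0,
    "fm": 2.0,
    "w": 9.0,
    "d_percent": 30.0,
    "r": 20.0,
    "cards_8_9_10_percent": 35.0,
}
print(composite_score(example))  # 20 + 12 + 72 - 45 + 20 - 35 = 44.0
```

The signs of the weights follow the signs of the group I coefficients for the same categories in table 24.5, which is consistent with weights derived from the first validation sample.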

Evaluation 

Several  important  qualifications  arising  from  the  nature  of  the  ex¬ 
perimental  design  limit  conclusions  concerning  the  validity  of  the  Ror¬ 
schach  technique  for  classification  purposes. 

It  is  to  be  remembered  that  the  present  study  did  not  validate  directly 
the  interpretive  schema  of  the  Rorschach  test.  Only  clinical  predictions, 
which  were  assumed  largely  to  reflect  these  interpretations,  were  vali¬ 
dated.  In  this  technique,  as  in  other  clinical  procedures,  the  clinical  pre¬ 
dictions  were  in  part  dependent  on  the  pilot  stereotypes  of  the  examin¬ 
ers;  only  those  qualitative  features  of  the  records  which  appeared  rele¬ 
vant  to  their  assumptions  regarding  pilot  training  were  considered.  In 
other  words,  at  least  to  some  extent,  each  examiner  tended  to  employ 
his  subjectively-derived  system  of  weighting  significant  features  of  the 
records.  It  is  not  to  be  assumed  that  uniformity  existed  among  the  ex¬ 
aminers  with  respect  to  their  biases  and  weighting  of  factors.  Differences 
between  examiners,  significant  at  the  1  percent  level,  have  been  demon¬ 
strated  even  for  such  objective  data  as  number  of  responses  obtained 
from  examinees. 

Since the test was originally intended for use with case-history data
and  other  information,  the  present  study  cannot  be  considered  to  have 
defined  the  effectiveness  of  the  technique  when  used  in  connection  with 
a  lengthier  clinical  procedure.  It  can  be  said  that  the  objective  category 
scores  have  been  validated  against  the  orthodox  training  criterion  for 
pilots.  It  is  hard  to  see  how  subjective  evaluations  based  upon  the  same 
scores  could  yield  much  better  results  against  the  same  criterion. 

It  is  to  be  emphasized  that  the  nature  of  the  test  itself,  in  its  present 
form,  precludes  any  final  answers  concerning  its  validity  for  predicting 
flying  performance.  As  long  as  (1)  the  most  important  datum  (qualita¬ 
tive  interpretation)  yielded  by  this  test  depends,  to  a  critical  degree,  on 
individual  insight,  intuition,  and  skill,  (2)  the  differences  among  exam¬ 
iners  in  these  respects  remain  difficult  to  measure  and  control,  and  (3) 
examiner  skill  remains  difficult  to  communicate,  negative  results  may  al¬ 
ways  be  attributed  to  inadequacies  in  the  examiner  personnel.  Such  be¬ 
ing  the  case,  the  test,  in  its  present  form,  cannot  be  of  practical  use  to 
a  large-scale  classification  program. 


There remains a possibility that the basic quantitative data yielded by
the test can be subjected to another manner of analysis which would
reduce or eliminate the influence of the examiner-difference variable
and would validate significantly. Yet, it has been seen that one such type
of  analysis  (the  validation  of  single  categories  or  of  their  composite)  did 
not  yield  promising  results.  Some  form  of  pattern  analysis,  in  which 
the  interaction  of  the  personality  variables  (as  represented  by  the  scor¬ 
ing  categories)  is  recognized  and  preserved,  may  very  possibly  prove  to 
be  promising.  However,  statistical  procedures  have  thus  far  failed  to 
provide  a  measure  which  would,  in  the  first  place,  encompass  the  most 
significant  features  of  the  qualitative  interpretations,  namely,  the  inter¬ 
relationship  and  interaction  of  personality  factors,  and  in  the  second 
place,  be  capable  of  direct  validation  against  a  training  criterion. 

Group Administration of the Rorschach Test, CE701B

In an attempt to overcome the difficulties inherent in the time-consuming
practice of individual administration of the Rorschach test,
experimentation with forms suitable for group administration was attempted.
Experimentation was undertaken first at Psychological Research Unit
No. 3, and subsequently at Psychological Research Unit No. 1.

Description. — Two forms of the Rorschach test suitable for group
administration were tried. The Picture Exercises test was developed
experimentally at Psychological Research Unit No. 3.* No code number
was assigned to this form. The Visualization Multiple Choice test,
CE701B (Harrower-Erickson) (3), was validated at Psychological
Research Unit No. 1.

(1)  Internal  characteristics. — The  explanation  of  the  two  forms  will 
be  developed  in  parallel  fashion.  The  basic  apparatus  used  in  the  admin¬ 
istration  of  the  two  forms  was  the  same;  a  projector,  a  screen,  and 
lantern-slide  representations  of  the  standardized  Rorschach  blots. 

The main difference between the two forms was in the nature of the
responses. In the Picture Exercises test each examinee recorded his free
responses for each slide. As many responses as the examinee made were
recorded for each slide. The main reason for this choice of technique was
that the word responses supplied by Harrower-Erickson (2) were not
regarded as necessarily the most applicable to the special sample of
young American males undergoing selection for air-crew training. By
allowing free responses in the first administration, too, it was hoped to
develop lists of suitable alternatives empirically for use in later
modifications of the test.

* Owl : U J. Xkktrt Huik U W«. SttrtM.

In the Visualization Multiple Choice test, the examinee's task was to
choose responses from among the 10 standard Harrower-Erickson
alternatives presented with each slide. If two of the alternates seemed to
apply, the examinee was permitted to indicate a second choice. The
alternatives for slide No. 2 afford an example of the type of choices presented:

1.  A  bug  somebody  stepped  on. 

2.  Nothing  at  all. 

3.  Two  scottie  dogs. 

4.  Little  faces  on  the  sides. 

5. A bloody spinal column.

6.  A  white  top. 

7.  A  bursting  bomb. 

8.  Two  elephants. 

9.  Two  clowns. 

10.  Black  and  red. 

(2) Administration. — The Picture Exercises test was administered to
approximately 140 men at a time. Each examinee was given a prepared
answer sheet on which to record his free responses. After detailed
instructions, the room was darkened for 30 seconds, and a Rorschach blot
was projected on the screen. Then lights were turned on so that there
was sufficient light by which to record responses, yet leaving the blot
dimly visible on the screen. Total testing time was 30 minutes. In the
instructions the examinees were told that there were no right or wrong
answers; that they were merely to write down all that they saw in each
slide. The time limits and manner of presentation of the slides (30
seconds for study; 60 seconds for recording responses) were explained.
The responses were written in longhand; accordingly, machine scoring
of this form was not possible.

The Visualization Multiple Choice test (Harrower-Erickson) was
administered  to  approximately  80  aviation  students  at  a  time.  This  form 
employed  an  answer  sheet  suitable  for  machine  scoring.  In  the  instruc¬ 
tions  the  examinee  was  informed  of  the  general  nature  of  the  procedure 
to  be  followed  and  was  instructed  specifically  in  the  use  of  the  answer 
sheet. 

First, you will take a good look at each picture as it is shown and see whether it,
or any part of it, reminds you of anything or resembles something you have seen.
Then  you  will  read  through  a  list  of  suggested  replies  to  see  which  of  these  is  the 
best  description  of  the  blot. 

Thirty  seconds  were  allowed  for  the  study  of  each  blot,  and  then  30 
seconds  for  the  recording  of  responses.  During  the  period  when  re¬ 
sponses  were  being  recorded,  the  blot  was  still  dimly  visible  on  the 
screen.  Total  testing  time,  including  administration,  was  approximately 
15  minutes. 

(3) Scoring. — Because the Picture Exercises test is a new technique
for administering the Rorschach ink blots, an arbitrary list of categories
was adopted to which responses were assigned. The list follows:

A. Human.
B. Human anatomy.
C. Man made objects.
   1. Food.
   2. Animal anatomy and man made objects.
   3. Maps.
D. Animals.
E. Animal anatomy and animal detail.
F. Mythological and cartoon.
G. Marine.
H. Nature.
I. Microscopic.
J. X-ray pictures.
K. In complex responses, only the first mentioned objects are classified.
L. Original. In this classification system, means any unclassified response.


In this form all responses were recorded on cards according to
category. Statistical treatment was undertaken only with first responses.
Popular  responses  were  determined  by  frequency  counts. 

The  Visualization  Multiple  Choice  test  was  scored  by  six  trained  Ror¬ 
schach  examiners,  using  the  Harrower-Erickson  method  of  scoring  (2). 
The  number  of  normal  responses  to  each  card,  as  defined  by  Klopfer 
and  Kelley  (4),  is  the  principal  category  developed  for  statistical  treat¬ 
ment. 

Statistical  results. — Because  of  the  involved  nature  of  the  scoring 
process,  only  limited  data  are  available. 

(1)  Test  reliability. — No  reliability  data  are  available  for  the  Picture 
Exercises  test.  Data  are  available,  however,  for  the  Visualization  Multiple 
Choice  test.  The  product-moment  correlation  of  normal  responses  between 
the  five  odd-numbered  cards  and  the  total  10  was  found  to  be  0.85.  Cor¬ 
recting  for  overlapping,*  the  odd-even  reliability  was  estimated  to  be 
0.42.  From  these  results  it  is  obvious  that  the  test  does  not  have  accept¬ 
able  odd-even  reliability.  This  is  due  to  the  small  number  of  cards  and, 
probably,  also  to  the  fact  that  the  cards  in  the  test  do  not  appear  to  be 
of  equal  value  for  eliciting  normal  v.  abnormal  responses. 
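
The correction for overlapping can be illustrated with the standard formula for the correlation of a part with the remainder of a whole, which is assumed here to be the formula the footnote refers to. The source does not report the standard deviations that entered the computation, so the values below are hypothetical: two halves of equal spread are constructed with an odd-even correlation of 0.42, and the function recovers that figure from the part-whole correlation.

```python
import math


def part_remainder_r(r_part_total, sd_part, sd_total):
    """Correlation of a part with the remainder of the whole.

    r_oe = (r_ot * SD_t - SD_o)
           / sqrt(SD_t^2 + SD_o^2 - 2 * r_ot * SD_t * SD_o)
    """
    numerator = r_part_total * sd_total - sd_part
    denominator = math.sqrt(
        sd_total ** 2 + sd_part ** 2 - 2 * r_part_total * sd_total * sd_part
    )
    return numerator / denominator


# Hypothetical case: odd and even halves each with SD 1.0 and r_oe = 0.42.
sd_odd = 1.0
sd_total = math.sqrt(2 * (1 + 0.42))    # SD of the sum of the two halves
r_odd_total = (1 + 0.42) / sd_total     # part-whole correlation, about 0.84
r_odd_even = part_remainder_r(r_odd_total, sd_odd, sd_total)
print(round(r_odd_even, 2))
```

With equal spreads a part-whole correlation near 0.84 corresponds to an odd-even correlation of 0.42, close to the pair of figures reported in the text (0.85 and 0.42); the small difference would be expected if the two halves did not have exactly equal standard deviations.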

(2)  Test  validity. — The  Picture  Exercises  test  was  validated  on  a 
sample  of  591  pilots  in  primary  training,  in  class  44D.  The  results  for 
the  six  most  predictive  categories,  as  defined  by  Klopfer  and  Kelley  (4), 
are presented in table 24.6.

The  Visualization  Multiple  Choice  test  was  validated  against  the  cri¬ 
terion  of  graduation-elimination  for  811  pilots  in  primary  training. 
Validities  determined  on  the  basis  of  first  and  second  choice,  normal  and 
abnormal responses, and of total number of second choices made were
uniformly low (-0.14 to 0.06).

Evaluation.— On  the  basis  of  the  data  obtained  for  group  administra¬ 
tion  of  the  Rorschach  test,  it  is  apparent  that  the  multiple-choice  form 
(Harrower-Erickson)  would  not  contribute  significantly  to  the  predic¬ 
tion  of  graduation  or  elimination  from  primary  pilot  training.  In  the 
free-response form, the popular responses gave some promise, but this
result needs verification with a larger sample.


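* By means of the formula for correlation of a part with the remainder of a whole. The
formula employed was:

\[ r_{oe} = \frac{r_{ot}\,\sigma_t - \sigma_o}{\sqrt{\sigma_t^2 + \sigma_o^2 - 2\,r_{ot}\,\sigma_t\,\sigma_o}} \]

in which o = odds score, e = evens score, and t = total score.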

Table 24.6. — Validation of the six most predictive categories of the Picture
Exercises test, using a graduation-elimination criterion for a sample of pilots
in primary training (1)

Category                        M_g      M_e     SD_t    r_bis (2)
Popular responses              4.21     3.62     1.32     0.21**
Percent animal responses      4?.90    41.60    15.60      .??*
Rejections                      .65      .96     1.06     -.14*
Total number responses        15.56    15.98     5.13     -.04
Movement responses             2.25     2.24     1.37      .00
Human responses                1.84     1.96     1.22     -.03

(1) N_t = 591, p_g = 0.92.
(2) Assuming an unrestricted stanine standard deviation of 2.00.
** Significant at the 1 percent level.
* Significant at the 5 percent level.

It  is  interesting  to  note  that  most  of  the  examinees  were  not  aware  of 
the purpose of either test. At the conclusion of testing, one group of
examinees was asked to volunteer opinions as to the purpose of the test.
Some of the responses were: "A test for detecting camouflage; a test
to determine visual acuity; a color-blindness test; a map-reading test;
a test of imagination." Discussion among some of the examinees
revealed that the majority tended to choose only the most acceptable social
responses,  as  fear  of  consequences  (elimination  from  training)  in  the 
Army  situation  made  a  completely  free  choice  almost  impossible.  Refer¬ 
ence  to  sex  organs  as  one  alternative  provoked  considerable  hilarity,  but 
even  though  this  alternative  might  have  been  readily  discerned  in  the 
ink  blot,  the  majority  of  examinees  refrained  from  making  such  a 
choice. 

The Thematic Apperception Test, CE706A

It  was  hypothesized  that  the  Thematic  Apperception  test  promised 
types of information regarding the relationship of goals, values,
conflicts and anxieties, and emotional complexes to air-crew success beyond
that  obtainable  by  the  inventory  procedures.  Accordingly,  studies  were 
undertaken  with  a  form  of  the  test  that  is  suitable  for  group  adminis¬ 
tration.* 

Description. — The present test is an adaptation of the Thematic
Apperception test, developed at the Harvard Psychological Clinic (5). It
is  a  technique  that  is  purported  to  reveal  to  the  trained  interpreter  indices 
of  the  dominant  drives,  emotions,  attitudes,  and  behavior  patterns  of  a 
personality.  In  clinical  practice  it  is  claimed  to  have  demonstrated  par¬ 
ticular  value  by  its  power  to  uncover  underlying  tendencies  which  the 
examinee  is  either  unwilling  to  expose  or  unable  to  expose  because  he 
is  unconscious  of  them.  The  procedure  of  the  test  is  to  present  to  the 
examinee a series of pictures, each portraying one or more human beings
who  can  be  variously  interpreted  as  to  their  characters  and  situations. 

* Developed at Psychological Research Unit No. 1. Chief contributors to the amended scoring
technique: Lt. John S. Hording, Cpl. Charles E. Orbach, and S/Sgt. John L. Wallen.

The examinee's task is to tell a story about each picture in which he
should answer such general questions as:

What  has  happened  to  the  individuals  in  the  picture? 

What  are  their  present  thoughts  and  feelings? 

What  will  be  the  outcome  of  the  story? 

The  second  phase  of  the  test  consists  of  an  interview  in  which  the 
examinee  is  probed  for  his  associations  and  memories  in  connection  with 
various  elements  in  his  stories.  Specifically,  the  test  furnishes  data  con¬ 
cerning  (1)  the  type  of  individual  adjustment  to  different  areas  of  life 
situations  (for  example,  family,  heterosexual,  and  authority  relations), 
and  (2)  attitudes,  motives,  and  emotions  accompanying  these  adjust¬ 
ments.  Thus,  the  test  samples  both  covert  and  overt  areas  of  personality. 

(1)  Internal  characteristics. — For  group  testing,  the  pictures  were 
transferred  to  slides  and  projected  on  a  screen.  The  12  slides  consisted 
of  the  Harvard  Psychological  Clinic  Thematic  Apperception  test  pictures 
(5)  of  which  the  following  examples  are  typical: 

(a)  Several  male  figures  in  work  clothes,  reclining  on  grass,  and  ap¬ 
parently  dozing. 

(b)  A  farm  scene,  adult  male  and  adult  female  against  background  of 
tilled  fields.  A  girl  with  books  in  her  arms  in  foreground. 

(c)  An  elderly  woman  with  back  turned  to  a  young  man  holding  a 
hat  in  his  hands. 

(2)  Administration. — This  test  was  administered  to  28  men  in  a 
group,  with  administration  time  of  approximately  75  minutes.  The  in¬ 
structions  read  to  the  group  are  as  follows: 

This is a test of creative imagination. You are going to compose stories, but do
not bother about how well your stories are written; you will not be graded on
spelling, phrasing, or style. Literary ability is unimportant. This is not an
intelligence or a judgment test. Your stories will not be scored as right or wrong.
This test is merely to find out how good your imagination is when you are pressed
for time.

Here is what you are to do. You will be shown some pictures. You are to make up
a story to go with each picture, that is, the ideas for your story should come from
the picture. You can make up anything you please; you are to use your imagination
freely, but you must tell a complete story from beginning to end for each picture.
As you will have only 6 minutes to write each story, be careful that you do not
spend your time merely describing the picture. Be sure to give a complete plot.

Remember that you will have only 6 minutes for writing each story. Plan to use
all your time in order to make your stories sufficiently detailed. To help plan your
time adequately, you will be told when there are 3 minutes left, and again when only
1 minute remains.

Under  the  group  administration  of  the  test  it  was  not  possible  to  con¬ 
duct  the  usual  interview  to  elicit  associations  and  memories  in  connection 
with  the  stories.  Instead,  the  slides  were  shown  again  for  5  seconds  each, 
and  the  examinees  were  asked  to  indicate  their  reaction  to  each  picture 
in  terms  of  a  3-point  scale  of  pleasantness.  These  reactions  were  recorded 
on  a  standard  IBM  answer  sheet.  Column  A  was  marked  for  a  pleasant 
reaction, B for indifferent, and C for unpleasant.

(3) Scoring. — As planned, the story analysis would involve the
following steps:

(a)  Scoring  each  story  for  each  of  38  traits  as  listed  by  Murray  (5). 

(b)  Making  notes  on  the  qualitative  elements  of  each  story. 

(c) Combining the results of (a) and (b) for the entire series of
pictures  in  order  to  define  the  major  tendencies  of  the  examinee's  per¬ 
sonality. 

(d)  Making  an  estimate,  on  the  basis  of  these  results,  of  the  exami¬ 
nee’s  chances  for  success,  on  the  nine-point  scale,  in  elementary  pilot 
training. 

Experience proved that the method of scoring using 38 categories was
too expensive and unwieldy. It seemed that the factors were too refined
and subtle to be assessed from only 12 written stories. Accordingly, a
revised scoring system was prepared, using only the 20 traits from
Murray's list that could be isolated more easily in the data.

Six  interpreters  dealt  with  the  material.  The  noncommissioned  officer 
in  charge  had  been  trained  in  this  test  in  the  Harvard  Psychological 
Clinic.  An  attempt  to  obtain  uniformity  of  interpretation  of  the  records 
was  made  by  having  introductory  training  sessions  for  the  interpreters, 
plus  supervision  of  scoring  by  the  noncommissioned  officer  in  charge. 

When the scored and qualitative data were completed, the interpreter
wrote a report synthesizing the examinee's personality tendencies. With
these tendencies formulated, the interpreter next considered their probable
composite contribution to success or failure in primary training in
terms of a prediction on a nine-point scale. To facilitate the formation
of such a composite judgment from so many variables, the interpreter,
upon completing the scoring, classified each of the 20 personality traits
of the examinee among four categories (strong, normal, mixed, weak)
on the criterion of the probable contribution of each trait to success or
failure in elementary training.

To the extent that the traits tended to cluster in the normal and strong
categories, a higher prediction of success was given, and to the extent
that the traits tended to cluster in the weak and mixed categories, a lower
prediction was made. It is important to stress that the clinical prediction
was not based upon the frequency results alone, but included nonscorable
qualitative aspects of the examinee's stories. Hence, the final
predictive estimate was in no sense derived from calculations approximating
a crude formula. Rather, the rating was based on a broad clinical-type
judgment in which were synthesized all relevant qualitative aspects,
including those which could be systematically stated in terms of
frequencies as well as those which could not.

The 20 traits employed in the later analyses are:

A.  Ego  image: 

1.  Sex  identification. 

2.  Age  identification. 

3.  Action  initiative. 

4.  Adequacy. 

5. Endings.

6.  Goal  orientation. 

7.  Super  ego. 

B.  Emotional  pattern: 

1.  Emotional  strength  and  control. 

2.  Frustration  tolerance. 

3.  Aggression. 

4.  Anxieties. 

5.  Emotional  maturity. 

6.  Picture  tone. 

7.  Examinee’s  test  orientation. 

C. Social adjustment:

1. Orientation type, in terms of autonomy.

2. To father and authority figures.

3. To mother figures.

4. To young adult females.

5. To young adult males.

6. To the Army.

Statistical results. — Data are treated in two groups. The first follows
the system of scoring according to 38 categories, and the second follows
the revised scoring according to 20 categories.

(1)  Test  reliability. — Reliability  data  are  not  available.  The  method 
of  obtaining  such  reliabilities  probably  should  be  by  means  of  a  test- 
retest  or  alternate-forms  procedure. 

(2) Test validity. — The validity of the predictions of success
(graduation-elimination) in primary pilot training, based on interpretation of
the data using 38 factors, is indicated by a biserial r of -0.05 for a
group of 293 pilots (p_g = 0.89). This coefficient is not significantly
different from zero.1
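
The biserial r quoted above relates a continuous rating to the dichotomous graduation-elimination criterion. Purely as an illustration, the Python sketch below computes a biserial coefficient with the usual textbook formula, r_bis = ((M_g - M_e) / SD_t) * (p q / y), where y is the ordinate of the unit normal curve at the point dividing the proportions p and q. The function name and the sample ratings are hypothetical, and the report does not state which computing formula was actually used.

```python
from statistics import NormalDist, mean, pstdev


def biserial_r(scores, passed):
    """Biserial correlation between a continuous score and a pass/fail criterion.

    scores: numeric predictions (for example, nine-point ratings).
    passed: booleans, True for graduates and False for eliminees.
    """
    grads = [s for s, ok in zip(scores, passed) if ok]
    elims = [s for s, ok in zip(scores, passed) if not ok]
    if not grads or not elims:
        raise ValueError("both graduates and eliminees are required")

    p = len(grads) / len(scores)      # proportion graduating
    q = 1.0 - p                       # proportion eliminated
    sd_total = pstdev(scores)         # standard deviation of the whole group

    # Ordinate of the unit normal curve at the point that cuts off p and q.
    unit_normal = NormalDist()
    y = unit_normal.pdf(unit_normal.inv_cdf(p))

    return (mean(grads) - mean(elims)) / sd_total * (p * q / y)


# Hypothetical ratings for ten examinees, not data from the report.
ratings = [3, 4, 5, 5, 6, 6, 7, 7, 8, 9]
graduated = [False, False, True, False, True, True, True, True, True, True]
print(round(biserial_r(ratings, graduated), 2))
```

A coefficient near zero, such as the -0.05 reported here, means that the mean rating of graduates and the mean rating of eliminees were almost identical relative to the spread of ratings.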

The  correlations  between  the  predictions  based  on  38  traits  and  the 
predictions  for  each  of  three  other  clinical  techniques  are  given  in 
table  24.7. 


Table 24.7. — Correlations of clinical predictions based on three other clinical
techniques with the predictions for the Thematic Apperception Test based
on 38 Traits

Technique                                    N       r

Rorschach Interpretation (CE701A)           320     0.04
Interview (CE707A)                          323      .12
Observational Stress Test (CE710A)          314      .11

1 The source from which these data were obtained failed to give information concerning
M_g and M_e.

On the basis of these findings it is concluded that ratings based on
38 traits were unsuccessful in predicting performance of pilots in
elementary training. Also these ratings are unrelated to ratings based upon
other clinical techniques.

The validity of the predictions based on interpretation of the data
using 20 traits was also determined. For a group of 191 pilots in
primary training, using a graduation-elimination criterion (p_g = 0.81), the
biserial r was 0.05, which is not statistically significant.

The correlations between the clinical predictions based on the 20-trait
interpretation of the Thematic Apperception Test and upon other clinical
techniques are presented in table 24.8.


Table 24.8. — Correlations of clinical predictions based on several clinical
techniques with the predictions for the Thematic Apperception Test
based on 20 Traits

Technique                                          N       r

Observation during PMT Rest Period, CE709A        175     -0M
Personal Audit, CE431A                            190     -JOt
Interaction Test, CE425A                          185     .0*
Observational Stress Test, CE710A                 119     .17
Interview, CE707A                                 190     .19
Rorschach, CE701A                                 190     jOS

Three of the 20 personality traits, deemed to be basic to the
interpretations, were validated separately. First, the total number of favorable
or plus variations in each trait was correlated with the criterion, then
the number of unfavorable or minus variations in each trait was
similarly correlated, and finally the difference between the number of pluses
and minuses in each trait was correlated with the criterion. These results
are presented in table 24.9.


Table  24.9. —  Validity  data  for  ratings  based  upon  three  personality  traits  of  the 
Thematic  Apperception  Test,*  CE706A 


Factor 

Plus 

Minus 

Difference

-0.0* 

.00 

.0J 

-0.09 

.04 

-.11 

-0.01 

-.02 

M 

Emotional  strcn(th  and  control  . 

•  N,  =  186,  st~0M. 


On the basis of these data it is concluded that ratings based on 20
traits likewise have no value in predicting success in primary pilot
training. Neither do they correlate substantially with other predictions of
success in pilot training derived from other personality evaluations. Three
special indicators of personality likewise failed to show pilot validity.

Evaluation. — From these results it is clear that the group
administration of the Thematic Apperception test as applied cannot be used to
predict success in primary pilot training. It is possible that the Thematic
Apperception test actually measures personality adequately, but pilot
aptitude may strongly overshadow the importance of personality factors
in elementary training. In other words, another criterion in which
temperament plays a larger role, such as combat performance, might yield a
higher validity. If temperament is a significant aspect of success or
failure in pilot training, however, it would seem that anything as searching
and global as the thematic test should predict the pass-fail variable.

Considerable difficulty is encountered in scoring this instrument.
Because of the nature of the test, it is obvious that each record must be
individually scored and interpreted. This interpretation required as much
as 2 hours' time per case. Thus it is obvious that the test is not
economical in its present form.

Probably one of the most important considerations, however, is the
difficulty of interpretation itself. The bulk of the interpretational
difficulties arose from three factors: (1) Examiner inexperience, (2) the
subjective nature of the scoring, and (3) lack of secondary criteria. The
nature of the first two difficulties is rather obvious. The difficulty arising
from the lack of secondary criteria generally centered about the problem
of discriminating as to whether an element was a projective or an
introjective manifestation of personality. More specifically, were the stories
the true projections of the examinee's personality, that is, drawn from
his own personality, or no less likely, were the stories the wishful
creations in the examinee's fantasy of what he would like to be and is not?
In other words, the same type of hero in the stories of two different
examinees admits of two different and opposite interpretations. Clearly,
secondary criteria are needed.

The  Rapid  Projection  Test,  CE711C 

It is taken as axiomatic that even the most normal person, if subjected
to sufficient stress, will exhibit some personality disturbance. Some,
however, are more susceptible than others to this. The term, combat neurosis,
is generally taken to imply predisposing personality weaknesses, or
proneness to break-down under stress. This projection test was
constructed in an effort to obtain an estimate of susceptibility to combat
neurosis.*

Description. — It was considered that some 22 characteristics seemed
basic in forecasting combat neurosis:

1. Poor family adjustment.

2. Dependence.

3.  Insecurity  (uncontrolled). 

4.  Overcompensation. 

5.  Lack  of  group  identification. 

6.  Inability  to  externalize  hostile  reactions. 

7.  Civilian  functional  somatic  complaints. 

8. Poor job adjustment.

9. Weak ego.

10. Schizoid tendencies (paranoid).

11. Obsessive-compulsive tendencies.

12. Inability to take it physically.

13. Lack of belief in democracy.

14. Lack of belief in support on the home front.

15. Lack of belief in worthwhileness of efforts.

16. Lack of belief in leaders.

17. Lack of advance awareness of combat.

18. Lack of interest in flying.

19. Uncontrolled prestige drive.

20. Inability to become absorbed in a technical job.

21. Inability to become detached from a situation.

22. Lack of conviction of personal invulnerability.

*Developed at Psychological Research Unit No. 1. Chief contributors: Pvt. Kenneth A.
Fisher, Sgt. Harold M. Proshansky, and Capt. Donald E. Super.

It  was  thought  that  this  instrument  might  elicit  information  on  some 
of  these  reaction  patterns.  Of  these  patterns,  it  was  hoped  that  some 
would  be  critical  indices  of  personality  instability. 

Two forms of the test preceded this one, gradually converging from
a general exploratory treatment of the technique to a specific
multiple-choice-type test in CE711C. In CE711A a relatively large number of
pictures were taken from magazines or were sketched and made into
lantern slides for group administration. The examinees were asked to
write what they thought of each picture. This was intended to obtain
protocols for each picture, so as to indicate the best multiple-choice items
for each picture and to eliminate poor pictures. After a short period of
experimentation, this form, having shown no indication of any positive
results, was abandoned.

Form CE711B also served for the collection of preliminary data. It
consists of 43 pictures with 3 questions concerning the content of each
picture. The series of pictures is shown in sequence 3 times, with a
5-second exposure of each picture and 1 minute in which to write no
more than a single-sentence answer to each question. The series is shown
to permit answering of the first question for each picture, then the series
is repeated to permit answering the second question, and again for the
third. For example, one picture shows a soldier lying on a beach alone,
close to the water. The three questions are (a) "What is this soldier's
nationality?"; (b) "Where are this soldier's companions?"; and (c)
"Why is he lying on the ground?" No statistical treatment was accorded
these results.

Form CE711C is derived from a Rapid Projection Test developed at
the Harvard Psychological Clinic as part of an experimental battery for
the selection of combat officers (6).

(1) Internal characteristics. — The material consists of 20 of the
original 24 Murray Rapid Projection Slides. Four pictures were
eliminated on an a priori basis, because they were very similar to other
pictures. The slides are made from photographs of single male figures or,
in a few cases, of groups of men.

Two series of answers are used for each slide, and answers are
recorded on the standard A-O (15-choice) type IBM answer sheet. The
first series consists of five possible interpretations of the content of the
slide, any one of which will answer the question "What happened?"
These are arranged in groups of five for each slide, picture one being
1-5; two, 6-10; and three, 11-15, etc.

The second series consists of 15 names or descriptions of feelings or
emotions, any one of which will be an answer to the question "How is
he feeling?" These are listed with letters from A to O. The examinee
answers both questions with a single black mark on the answer sheet.
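
The numbering scheme just described lets one mark carry two answers. The Python sketch below illustrates how such a mark could be decoded, on the assumption (implied but not spelled out in the text) that the row of the answer sheet is the number of the chosen interpretation and the column is the letter of the chosen feeling. The function and constant names are hypothetical.

```python
import string

NUM_SLIDES = 20                            # slides retained in form CE711C
OPTIONS_PER_SLIDE = 5                      # interpretations offered per slide
FEELINGS = string.ascii_uppercase[:15]     # columns A through O


def decode_mark(item_number, column_letter):
    """Split one answer-sheet mark into (slide, interpretation, feeling).

    Assumes interpretations are numbered consecutively, five per slide,
    so that slide one uses rows 1-5, slide two rows 6-10, and so on.
    """
    if not 1 <= item_number <= NUM_SLIDES * OPTIONS_PER_SLIDE:
        raise ValueError("item number outside the range used by the test")
    if column_letter not in FEELINGS:
        raise ValueError("column must be a letter from A to O")

    slide = (item_number - 1) // OPTIONS_PER_SLIDE + 1
    interpretation = (item_number - 1) % OPTIONS_PER_SLIDE + 1
    return slide, interpretation, column_letter


# A mark in row 13, column C: third slide, third interpretation, feeling C.
print(decode_mark(13, "C"))
```

Under this reading each slide offers 5 x 15 = 75 distinct response positions, which is the figure the statistical discussion relies on.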

(2) Administration. — The test is group administered. It is presented
to the examinees as a judgment test. They are told that it measures one's
ability to size up persons at a glance. It is implied that the pictures
represent actual situations and that in each case one of the answers is
correct. Each picture is projected in a darkened room for 6 seconds. Then
1 minute is allowed to record answers with room lights on and the picture
only dimly visible.

The  following  is  the  series  of  questions  for  a  picture  showing  an 
elderly  man  sitting  on  a  bench  which  appears  to  be  on  a  ship.  His  chin 
is  resting  on  his  hands. 

What  has  happened? 

11.  He  has  lost  his  job. 

12.  His  wife  has  died. 

13. His son was lost while on Atlantic convoy duty.

14.  A  splendid  opportunity  has  just  been  offered  him. 

15.  He  has  just  learned  that  he  has  an  incurable  disease. 

Feeling  choices: 

A.  Daring,  cocky. 

B.  Scornful,  contemptuous. 

C. Depressed, sad.

D. Pleased with himself.

E.  Anxious. 

F.  Terrified. 

G.  Pained. 

H.  Angry. 

I.  Submissive  or  resigned. 

J.  Frustrated,  or  blocked. 

K.  Confused,  hesitant. 

L.  Overjoyed. 

M. Tired.

N. Amused.

O. Humiliated.

Statistical results. — Only item-validity data are available for this test.

(1) Item validation. — Cross-validation data were secured for a
sample of 360 graduates and 196 eliminees from primary pilot training in
classes 44C, 44D, and 44E. The sample was split into odds and evens
groups. Because 75 response spaces are available for each picture, the
number of individuals selecting any one response is too small to permit
computing a reliable item statistic. Two separate item analyses were
made, therefore, one treating only judgments of factual content, and the
other, only judgments of emotional content. The data are presented in
tables 24.10 and 24.11.


Table 24.10. — Frequency distributions of phi coefficients for the criterion of
graduation-elimination from primary pilot training, for the Rapid Projection
Test, CE711A, based upon judgments of emotional content

Table 24.11. — Frequency distributions of phi coefficients for the criterion of
graduation-elimination from primary pilot training, for the Rapid Projection
Test, CE711A, based upon judgments of emotional content

In interpreting these data, it should be noted that for an N of 278 a
phi coefficient of 0.12 is significant at the 5 percent level of confidence,
and a phi of 0.16 at the 1 percent level of confidence. In the item
analysis based upon judgments of factual content, for the evens group, only
eight responses were significant at or beyond the 5 percent level. For
the odds group, there were seven such responses.

For the item analyses based upon judgments of emotional content,
there were 11 and 15 responses for the evens and odds groups
respectively, that yielded phi coefficients significant at or beyond the 5 percent
level.

The  responses  showing  significance  for  one  subsample  did  not  show 
significance  in  the  other  subsample. 
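
The counts of significant responses can be set against what chance alone would produce. The Python sketch below is an illustration under stated assumptions: it takes every option on each of the 20 slides as one analyzed response (100 factual and 300 emotional responses per subsample), uses the large-sample approximation that phi is significant when it exceeds z / sqrt(N), and treats responses as independent. The function names are hypothetical, and the report does not describe its own computing procedure.

```python
from math import sqrt
from statistics import NormalDist


def phi_coefficient(a, b, c, d):
    """Phi for a 2 x 2 table.

    a: graduates choosing the response      b: graduates not choosing it
    c: eliminees choosing the response      d: eliminees not choosing it
    """
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        raise ValueError("a row or column of the table is empty")
    return (a * d - b * c) / denom


def critical_phi(n, alpha):
    """Smallest |phi| significant at two-tailed level alpha for n cases."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z / sqrt(n)


def expected_chance_hits(n_responses, alpha):
    """Number of responses expected to reach significance by chance alone."""
    return n_responses * alpha


print(round(critical_phi(278, 0.05), 2))       # threshold for one subsample
print(expected_chance_hits(20 * 5, 0.05))      # factual-content responses
print(expected_chance_hits(20 * 15, 0.05))     # emotional-content responses
```

On these assumptions about 5 factual and 15 emotional responses per subsample would be expected to reach the 5 percent level by chance, which is close to the 8 and 7, and the 11 and 15, actually observed. This agrees with the finding that responses significant in one subsample were not significant in the other.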

Evaluation. — It must be kept in mind that this instrument was
designed to predict combat neurosis, and not merely for air crew selection
purposes. Whether the predictions were to have stemmed from a total
score or from an over-all impression is not clear.

In either case, it is anticipated that this test would not have a high
validity for predicting combat neurosis. The difficulty with this form of
the test, according to Murray, is that no satisfactory multiple-choice
answers have been developed to measure the variables thought to be
essential for predicting combat neurosis.

The  item  validation  results  indicate  that  this  test  has  no  validity  for 
the  primary  pilot  training  criterion. 

Empathetic Response Test, CE715A *


The rationale put forward by the constructor of this test is that a
satisfactory measure of the examinee's predisposition to combat neurosis
may be obtained from the manner in which he "empathizes" into a
fictional character's affective state. The term empathy is used here to mean
the imaginative projection of one's self into the mental state of another
person.

Description. — The examinee reads a short story, written in such a
way that he must empathize into the affective state of the chief character
in order to supply a conclusion for the story. Each story contains a
number of inconclusive clues or leads indicating a variety of possible endings,
but no specific ending is supplied. They are written in a subjective style;
that is, each is written as the conversations, musings, or reminiscences
of the chief character in the story. The stories deal with various aspects
of military life which are familiar, interesting, and significant to the
typical soldier, but are combined into novel puzzle situations. It was hoped
that the tendency would be for the examinee to supply a conclusion to
the story compatible with his previously established and predisposing
affective habit patterns.

After the reading of the stories, two classes of questions are asked:
Class I questions immediately follow each story and concern only that
story. They do not suggest any specific story ending, so as to avoid
prejudicing the examinee. Class II questions are all listed at the end of
the test booklet, and they inquire as to the specific endings given the
stories by the examinee. An example story follows:

At last, the hospital! They'll do something soon to ease the pain. That morphine
shot Joe gave me didn't last long. My foot * * * leg * * * they hurt like hell, but not
much damage. My hand. A mess! With a bun and mustard I could have a
hamburger. Joe shook his head when he looked at it. Does that mean * * * ? God,
what would Mary think? Ah, hell! Joe is smart, but he's no doctor. Could he tell?
That doc's coming now * * * what will he think? My foot and leg * * *
just a glance. Good, I thought they were O.K. My hand! He's gentle as a woman
with it. Does that mean it is bad? I hope that he's not the doc they call Hacksaw.

He is calling that other doc over to look. Why? Now they're going away. They
keep looking over here. Why don't they quit mumbling? One keeps shaking his
head. Does that mean he does or doesn't want to? Now they seem to agree.

This operating table feels swell * * * but it must mean an operation! No
use asking the doctor anything because he won't give a straight answer. Probably
ether for me soon. Yeah, here it comes. Ether! Does that mean a serious operation?
How serious? Why did that thing have to explode and get me in the hand? This
ether * * * it stinks. My ears are buzzing. Head's swelling * * * throbbing
* * * swelling * * * swell * * *

Head * * * foggy. Can't think straight. I'm out of the operating room, in
a ward. Operation's over? Must be! Operation * * * oper * * * hand!
What did they do to it? Can't see my hand. Must be under the blanket. My arm is.
Is * * * is my hand? No feeling in it. Should there be? Could there be? I'll
pull it out and look. I'll look when I get guts enough * * * if I get guts
enough. Guts enough? I'll look now. Right now. Now! God!

Two sample class I questions follow:

The  soldier  feels  that  the  operating  doctor: 

A.  Did  not  listen  to  the  reasonable  advice  of  the  second  doctor. 

B.  Felt  unable  to  make  a  decision  alone. 

C. Performed the type of operation obviously needed under the circumstances.

D.  Was  chiefly  interested  in  getting  the  operation  done  and  over. 

E.  Was  unconcerned  about  the  result  of  the  operation. 

Choose  the  word  or  phrase  which,  you  feel,  best  describes  the  general  attitude  of 
the  soldier  before  the  operation. 

A.  Fatalistic. 

B.  Frantic  with  worry. 

C.  Completely  unconcerned. 

D.  Deeply  concerned. 

E.  Emotionally  detached. 

The  class  II  question  for  this  story  is: 

In  the  story  of  the  soldier  wounded  in  the  hand,  I  feel  the  operation  ended  in: 

A.  Loss  of  the  whole  hand. 

B.  Loss  of  part  of  the  hand. 

C.  Restoration  of  the  hand  to  partial  usefulness. 

D.  Restoration  of  the  hand  to  full  usefulness. 

E.  A  result  concerning  which  I  cannot  reach  a  conclusion. 

The  test  consists  of  seven  stories,  each  followed  by  five  questions  of 
the  class  I  variety  with  five  alternatives  each.  The  class  II  questions, 
which  were  administered  as  a  group,  consist  of  one  question  for  each 
story.  Each  question  had  five  alternatives. 

(1) Administration. — Forty minutes were allowed for the test, two
forms of which were validated. Parts I and II are class I and II
questions for the first form, while parts III and IV are similar sections of a
comparable second form of the test. Pertinent administrative directions
are:

The purpose of this test is to measure your ability to understand a complete
situation when only part of the facts concerning it are known to you. It is not a
test of your ability to analyze and to draw scientifically correct deductions. This
test measures your ability to grasp quickly a complex situation from a minimum of
given facts. Your knowledge of human nature and your ability to understand how
and why people act as they do in perplexing situations is of great importance in this
test.

* * * you will read a series of short episodes and answer questions
concerning the people and events in the episode. Read each episode carefully but rapidly.
When you have finished reading the episode, turn to the following page and answer
the questions concerning the episode. Once you have turned to the page of questions
do not turn back to the previous page * * * .

(2) Scoring. — A total of eight scoring keys was constructed for eight
different categories of responses. Since no suitable external criterion
for the selection of items was available, the items were selected and the
provisional keys designed solely on the basis of a priori assumptions.
The provisional keys, together with the scoring formulas used for each,
are: (1) General anxiety (number of Rights); (2) repression (number
of Rights); (3) belief in worthwhileness of effort (R - W + 10); (4)
fear of death or injury (R - W); (5) sex conflict (R - W + 10); (6)
social adjustment (R - W + 10); (7) evasion-confusion (number of
Rights); and (8) attitude toward authority (R - W + 10).
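
The eight scoring formulas listed above can be written out as a small lookup. The Python sketch below is only an illustration: the dictionary layout, the function name, and the example counts are hypothetical, while the formula attached to each key follows the list in the text. The added constant of 10 presumably served to keep rights-minus-wrongs scores from going negative, though the report does not say so.

```python
# Each key maps to (formula kind, added constant), following the text:
# "rights" counts keyed responses only; "r_minus_w" is Rights minus Wrongs.
KEY_FORMULAS = {
    "general anxiety": ("rights", 0),
    "repression": ("rights", 0),
    "worthwhileness of effort": ("r_minus_w", 10),
    "fear of death or injury": ("r_minus_w", 0),
    "sex conflict": ("r_minus_w", 10),
    "social adjustment": ("r_minus_w", 10),
    "evasion-confusion": ("rights", 0),
    "attitude toward authority": ("r_minus_w", 10),
}


def score_key(key, rights, wrongs=0):
    """Return the score on one provisional key from counts of Rights and Wrongs."""
    kind, constant = KEY_FORMULAS[key]
    if kind == "rights":
        return rights
    return rights - wrongs + constant


# Hypothetical counts for one examinee.
print(score_key("sex conflict", rights=4, wrongs=3))      # R - W + 10
print(score_key("repression", rights=6, wrongs=2))        # number of Rights
```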

Statistical results. — Data are available for two somewhat overlapping
groups of pilots who took primary training.

(1) Test validity. — Validation data are available for two samples of
pilots, using the graduation-elimination criterion from primary training.
The results of one group are based on parts I and II of the test, while
those of the second group are based on the comparable parts III and IV.
These data are presented in table 24.12.


Table 24.12. — Validation data for two samples of pilots who took primary
training, using the graduation-elimination criterion, for the Empathetic
Response Test, CE715A

                                                                      .07
Sex conflict                                         2.17    -.08    -.10
Social adjustment                  9.24     9.54     2.05    -.09    -.09
"Evasion-confusion"                5.26     5.19     4.22     .01     .02
Attitude toward authority         10.85    11.07     2.27    -.06    -.07

III and IV³
General anxiety                   10.50    10.71     2.73    -.05    -.07
Repression                         6.33     6.18     2.27     .04     .04
Worthwhileness of effort          13.49    13.26     3.03     .05     .04
Fear of death or injury            8.03     7.82     3.14     .04     .02
Sex conflict                      11.34    11.35     1.06    -.01     .00
Social adjustment                 12.72    12.71     2.54     .00     .02
"Evasion-confusion"                2.78     2.69     2.32     .02     .01
Attitude toward authority         14.32    14.31     1.90     .00    -.01

¹ Corrected to an unrestricted stanine standard deviation of 2.00.
² For this sample N_t = 491, p_g = 0.68.
³ For this sample N_t = 537, p_g = 0.67.


For parts I and II a biserial coefficient of 0.12 is required for
significance at the 5 percent level and of 0.15 for significance at the 1 percent
level. For parts III and IV, coefficients of 0.11 and 0.14 are required
at the corresponding levels. None of the coefficients in table 24.12
approaches significance.
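
The thresholds quoted above are consistent with one common large-sample approximation, in which the standard error of a biserial r near zero is sqrt(p q) / (y sqrt(N)), with y the ordinate of the unit normal curve at the point dividing p and q. The Python sketch below applies that approximation to the sample sizes and proportions as read from the footnotes of table 24.12 (N = 491 with p = 0.68, and N = 537 with p = 0.67). The function name is hypothetical, and the report does not say that this was the formula used.

```python
from math import sqrt
from statistics import NormalDist


def biserial_critical_value(n, p, alpha):
    """Approximate |r_bis| needed for two-tailed significance at level alpha.

    n: total number of cases; p: proportion in the passing group.
    Uses the null-hypothesis standard error sqrt(p*q) / (y * sqrt(n)).
    """
    unit_normal = NormalDist()
    y = unit_normal.pdf(unit_normal.inv_cdf(p))     # ordinate at the cut
    standard_error = sqrt(p * (1 - p)) / (y * sqrt(n))
    return unit_normal.inv_cdf(1 - alpha / 2) * standard_error


for label, n, p in [("parts I and II", 491, 0.68), ("parts III and IV", 537, 0.67)]:
    five = round(biserial_critical_value(n, p, 0.05), 2)
    one = round(biserial_critical_value(n, p, 0.01), 2)
    print(label, five, one)
```

Rounded to two places, the sketch gives 0.12 and 0.15 for parts I and II and 0.11 and 0.14 for parts III and IV, matching the values stated in the text. The largest coefficient in the recoverable part of the table is 0.10 in absolute value.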

(2) Correlation between parts. — The correlations between paired
scores of parts I and II, and parts III and IV, for a sample of 443 pilots
in primary training who completed all four parts of the test, are
presented in table 24.13.


Table 24.13. — Correlations between paired scores of parts I and II, and parts III
and IV, of the Empathetic Response Test, CE715A, for 413 pilots who took
primary training

Fear of death or injury .

The results obtained are disappointing, as only two of the keys show
a level of reliability at all worthy of consideration. The
evasion-confusion key scores only the "cannot decide," the "do not know," "do not
understand," or "something not listed above" type of answer. The
second key, repression, which seems to have some reliability, contains many
of the same types of "cannot decide" responses and probably measures
much the same thing as the evasion-confusion key.

Evaluation. — Upon the basis of the obtained validities, neither parts
I and II nor parts III and IV of the Empathetic Response test have any
validity for predicting success in primary pilot training. It is possible,
however, that this instrument would yield a positive measure of
susceptibility to combat neuroses, more readily than the picture-presentation
projective type of test, because the examinee should empathize into an
interesting story, written in subjective style, more readily than into a
picture of doubtful meaning and interest.

For purposes of administration, in its present form the test has
several advantages: it is easy to administer and relatively easy to score; it
has considerable face validity, because the stories and questions deal with
situations involving psychological stress or conflict in a military context.

Additional  Use  of  Projective  Techniques 

In  this  section  will  be  described  briefly  four  techniques  that  have 
achieved  the  status  of  tests  without  having  been  administered.  Only  a 
brief  rationale  and  a  description  of  the  items  will  be  presented. 

Picture Evaluation Test, CE712A 10

The  purpose  of  this  test  is  to  sample  certain  attitudes  that  are  believed 
to  be  indicative  of  personality  factors  predisposing  to  combat  neurosis. 

(1) Description. — This test is a modified projective technique attempting
to eliminate individual differences in verbalization. It was constructed
for group administration. Each item consists of 5 pictures: a large
unstructured picture, which is presented by means of a lantern slide for 5
seconds, followed by 4 smaller pictures, each structured in ways such as
persons with various personality weaknesses might interpret the large
picture. The examinees are asked:

(1)  Which  small  picture  is  most  like  the  large  picture? 

(2)  Which  small  picture  tells  the  same  story  as  the  large  picture? 

The answers are set up for machine scoring and are recorded on the
standard IBM answer sheet. Both questions are to be scored, but it is
felt that question No. 2 will be the more fruitful.

As an example, one large picture shows an airplane swooping close
over the head of a man in the foreground. The man’s mouth is open as
if barking an order, or screaming, etc. The background is indefinite.

10 Developed at Psychological Research Unit No. 1. Chief Contributor: Capt. Horace K. Van
Stun.
The small pictures for this large picture are:

(a)  Man  dressed  as  flier  is  approaching  a  plane  (only  tail  assembly 
is  visible)  in  a  normal  airport  setting.  (Normal  or  escape  picture.) 

(b) A German plane ground-strafing American troops. (Attempting
to get an undue awareness of the threat of death.)

(c) An obviously civilian plane in a minor landing crack-up.
(Attempting to tap lack of interest or fear of flying.)

(d) Bugler blowing bugle with a background of graves. (Attempting
to reveal uncontrolled insecurity as expressed in interpreting any
incident as a threat to the self.)

It  was  planned  to  construct  at  least  15  large  pictures,  each  with  4 
alternative  small  pictures,  in  the  exploratory  form  of  the  test. 

Picture  Sequence  Test,  CE713A  11 

It is assumed that two factors are crucial in determining relative
resistance to the stresses of sustained combat exposure: (1) Intensity of
anxiety, generalized and concerning combat, and (2) extent of
counteracting control over anxieties. This test was designed to measure
predisposition to combat fatigue in terms of these two personality variables.

(1) Description. — This is a nonverbal test combining features of both
the projective techniques and the multiple-choice word-association tests.
Each item consists of a stimulus picture, which presents a situation
containing potential elements of stress, followed by two sets of four
pictures each. The task is to construct a story by selecting one picture from
the first set and one from the second. The first set presents varying
stress elements in a continuum of increasing severity. The second set
presents different completions of the story, which are assumed to
uncover varying degrees of anxiety control. For example, one stimulus
picture portrays a young man sleeping. The first set of pictures portrays:
(1) dreams of being carried off by an eagle; (2) dreams of being
chased by wild animals; (3) dreams of falling from a great height; and
(4) dreams of being trampled under a horde of “GI” shoes. The second
set of pictures portrays: (1) Young man sleeping with smile on face; (2)
young man sleeping with terror on face; (3) young man sitting up in
bed, tensed; and (4) young man sitting on edge of bed, face in hands.

It  was  planned  to  construct  40  large  pictures,  plus  30  diversionary 
dummies.  The  test  is  suitable  for  machine  scoring. 

A Structured Answer Projection Test, CE714A 12

This test seeks to measure, by a rapid-projection technique, those
attitudes which predispose airmen to combat neurosis or which tend to
serve as antidotes. These include such variables as belief in immediate
superiors, conviction of personal invulnerability, belief in worthwhileness
of own contribution to the war effort, and poor social adjustment.

11 Developed at Psychological Research Unit No. 1. Chief Contributor: Sgt. Leo Srole.

12 Developed at Psychological Research Unit No. 1. Chief Contributor: Lt. Martin Singer.
Description. — Fifty pictures are presented to the examinees, who are
required to select the best description of a picture from four
multiple-choice responses. The four possible responses are selected to reveal
projections ranging respectively from those most indicative of adjustment
to those most indicative of maladjustment to training and combat
situations. The preface to each picture and the composition of the picture
itself attempt to make all examinees identify themselves with the same
character and from their perspective as aviation students. The pictures
are as follows:

(1)  Ten  pictures  with  obvious  responses;  the  first  two  presented  as 
samples  and  the  other  eight  scattered  throughout  the  tests  to  make  the 
examinees  feel  they  are  taking  a  test  of  observational  ability.  These  eight 
obvious  pictures  may  also  uncover  the  extreme  personality  deviates. 

(2)  Twenty  original  pictures  ambiguous  enough  so  that  interpretation 
invites  projection  on  the  part  of  the  examinees. 

(3) Ten pictures, from the Rapid Projection test, CE711B, so that
the results of the present technique may be compared to the results
obtained with that instrument. Multiple choices are also used here.

(4)  Ten  pictures  used  in  the  Picture  Evaluation  Test,  CE712A,  for 
comparison  purposes.  Multiple-choice  questions  of  the  same  order  as 
described  above  are  used. 

As an example, the following questions are asked concerning a picture
of an officer who has just pulled the rip cord of his parachute.

From  his  expression: 

a.  He  is  getting  set  for  the  shock  of  the  parachute  opening. 

b.  The  parachute  has  opened  and  he  is  calmly  floating  down. 

c.  He  is  terrified,  because  the  parachute  has  failed  to  open. 

d.  He  is  worried  whether  or  not  the  parachute  will  open. 

Figure 24.1. — Scale used in Picture Judgment, CE716A. Scale points:
3, 2, 1, 0, -1, -2, -3.


Picture Judgment Test, CE716A 13

It  was  hoped  that  this  test  would  measure  degree  of  fear  and  sensitivity 
to  combat  situations,  and  thus  make  a  contribution  to  the  prediction  of 
susceptibility  to  combat  neurosis. 

13 Developed at Psychological Research Unit No. 1. Chief contributors: Sgt. Harold M.
Proshansky and Cpl. Walter J. Ret*.
Description. — The examinees are asked to rate 150 pictures projected
on a screen.

These pictures will show 24 pleasant nonmilitary situations, 25 neutral
nonmilitary situations, 25 ambiguous nonmilitary situations, 38
unpleasant nonmilitary situations, and 38 combat situations.

It will be established statistically that the pictures finally selected have,
for the majority of aviation students, the qualities: Pleasant, neutral,
ambiguous, and unpleasant. It is expected that, canceling out the constant
factor of strong or weak judgments, aviation students who are more
sensitive to combat situations will either (1) mark a greater number of
combat pictures in the unpleasant categories than their comrades or (2)
avoid the situations by marking combat pictures neutral.
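
The expectation stated above can be sketched as a small scoring routine. This is only an illustration under several assumptions that the report does not confirm: that ratings run from 3 to -3 with negative values meaning unpleasant, that the "constant factor of strong or weak judgments" is approximated by the examinee's mean absolute rating on the nonmilitary pictures, and that the function and field names (`combat_sensitivity`, `unpleasant_share`, and so on) are hypothetical.

```python
from statistics import mean


def combat_sensitivity(ratings, categories):
    """Summarize one examinee's ratings of combat pictures.

    ratings    -- integers from 3 to -3 (assumed: negative = unpleasant)
    categories -- parallel list of labels; "combat" marks combat pictures
    """
    combat = [r for r, c in zip(ratings, categories) if c == "combat"]
    other = [r for r, c in zip(ratings, categories) if c != "combat"]
    if not combat or not other:
        raise ValueError("need both combat and nonmilitary pictures")

    # Assumed correction for habitually strong or weak judgments:
    # the mean absolute rating given to nonmilitary pictures.
    strength = mean(abs(r) for r in other)

    return {
        # (1) share of combat pictures marked in the unpleasant categories
        "unpleasant_share": sum(r < 0 for r in combat) / len(combat),
        # (2) share of combat pictures avoided by a neutral mark
        "neutral_share": sum(r == 0 for r in combat) / len(combat),
        # mean combat rating relative to the examinee's usual strength
        "adjusted_mean": mean(combat) / strength if strength else 0.0,
    }


# Hypothetical examinee: three nonmilitary and three combat pictures.
example = combat_sensitivity(
    [3, 0, -1, -2, -3, 0],
    ["pleasant", "neutral", "unpleasant", "combat", "combat", "combat"],
)
```

Each examinee's two shares would then be compared with those of his comrades; how that comparison was to be made is not stated in the report.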


OBSERVATIONAL  TECHNIQUES 

Observational techniques yield data on the manner in which the examinee
performs a task and provide estimates of his attitudes. In general,
the primary function of observational data is to furnish descriptions of
the examinee’s overt behavior and interpretation of that behavior. The
observer is considered to be a sort of complex measuring instrument
who integrates his interpretations of the various observed reactions of
the examinee into a complex judgment, which is recorded. It was felt
that the observational techniques would yield data regarding personality
traits that would supplement similar data obtained by other clinical
techniques. Their independent value in prediction of air-crew success was
also to be determined.

Conference for the Interpretation of Test Scores and
Occupational Background, CE707A 14

The interview method was considered basic to any project employing
clinical procedures. The purpose of the conference technique was to
attempt to obtain and to analyze the candidate’s immediate and remote
experiences, so that classification-test scores could be interpreted in the
light of his background, as well as to give an estimate of the candidate’s
chance of success in air-crew duties. The information to be obtained is
not, as yet, available through any other technique.

Description. — The designation, “Conference for the Interpretation of
Test Scores and Occupational Background,” was given to this procedure
in order to avoid the undesirable connotations of the term “interview,”
which might influence rapport with the aviation students. This interview
technique permits the follow-up of leads and the asking of questions that
are difficult to put on paper or, if placed on paper, are easily distorted
by the examinee. The interview situation is made as informal and
relaxed as possible, so as to elicit a maximum of information concerning
the examinee.

14 Developed at Psychological Research Unit No. 1. Chief contributors: Clinical Procedures
Group.
(1) Internal characteristics. — The conference was conducted by one
interviewer for one candidate at a time. It was guided by a standard set
of questions and a list of the fields to be covered. During the interview,
however, the manner of putting the questions was left to the discretion
of the interviewer in order to encourage optimum rapport. Questions,
for the most part, were general and gave the examinee an opportunity
to enlarge upon any point. Few questions were aimed at obtaining purely
factual information, because most of this type of data was obtained
previously on the printed questionnaires (Occupational Classification
Experience Blank, CE603A, and Aviation Experience Blank). Specifically,
the interview was planned to reveal: (1) The relationship of the past
history of the examinee to his present personality; (2) what he has
learned in terms of his opportunity to learn; (3) the developmental
sequences of his life history; (4) his occupational history in relation to
his likelihood for air-crew success; (5) additional material to aid in the
interpretation of scores and measures derived from the Thematic
Apperception Test, CE706A, the Rorschach Test, CE701A, and other
printed tests of emotion and temperament; and (6) estimates (on a
rating scale) of the examinee’s self confidence, to aid in the interpretation
of scores on the confidence tests.

(2) Administration. — Each conference was of approximately 50
minutes’ duration. The main emphasis in the content of the interview was in
four areas: (1) Occupation, (2) motivation, (3) stability, and (4)
adaptability. The conference was so administered that adequate
information would be available for a written report to include information
under the following categories:

(a) Thumbnail sketch. — Included characteristics of motivation,
stability, and adaptability, as well as such factors as physical build, vigor,
dress, coordination, language, response to interviewer, and excessive
physical activity.

(b) Occupational history. — This section contained information
supplemental to the Occupational Background Blank; job ambitions before
the war; preference for type of job 5 years from now; age at which
examinee first earned money; why a particular job was taken; and special
skill and interests. Discrepancies between level of aspiration and level
of achievement were noted.

(c) Family relationships. — This area contained a brief description of
family composition; indication of areas in which examinee would prefer
to rear children differently than in the manner in which he was brought
up; type of work which his mother and father would prefer for him;
and their attitude toward combat flying. Important in this section were
critical points in familial adjustment, such as the possibility of
destructive sibling rivalry and the effect of marital discord on the examinee’s
discipline, goals, and ideals.
(d) Developmental history. — This area was concerned with an
evaluation of self in early childhood and in the preschool area; adequacy of
“social adjustment”; and physical adequacy in terms of any remembered
severe illnesses.

(e) School experiences. — Described the age at which school was
begun and concluded; best and poorest subjects; likes and dislikes in the
school situation; behavior or disciplinary problems; extracurricular
participation; level of aspiration in relation to school and the extent to
which this was reached; competition with siblings; reactions to teachers;
and ambitions in relation to further education.

(f) Leisure time interests and hobbies. — This section was a
description of the organizations in which the examinee had been a member;
how leisure time was spent; what things afforded the most pleasure;
what things “got him down”; participation in athletics and in
community organizations; and level of participation (member or officer).

(g) Socialization. — A description was given of the most difficult
problem faced; type of individual most admired; relation with girls; social
skills possessed; role in the family; and extent to which independence from
family control has been obtained.

(h) Army. — This section described ambitions in the Air Corps;
evaluation of preparedness for this ambition; estimation of chances of
getting wings; feelings in regard to combat; reasons for applying to
the Air Corps; plans if “washed out” of training; and reactions of parents
and friends to his being in the Air Corps.

(3) Scoring. — Three types of treatment were accorded the material
obtained: (1) An interview report was written, following the several
categories just enumerated; (2) predictions for success in air crew in
general, and for the specialties of bombardier, navigator, and pilot were
made, each on a nine-point scale; (3) ratings of confidence were made
on a five-point scale: Complete lack of self confidence; underconfident;
confident; overconfident; and complete overconfidence.

Statistical  results. — Owing  to  the  nature  of  the  data,  only  a  restricted 
number  of  statistical  procedures  were  attempted. 

(1) Test reliability. — Because of the method of administration, any
estimate of reliability based on this instrument would involve several
unwarranted assumptions; no reliabilities were therefore computed. It
would be possible to determine the reliability of ratings if two or more
raters were to make independent clinical predictions of success based on
the information contained in the interview summary.
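
As a hedged illustration of the check suggested above, the agreement between two raters could be expressed as a product-moment correlation between their independent predictions. The report does not say which index would have been used; the rating values and the names `pearson_r`, `rater_1`, and `rater_2` below are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Product-moment correlation between two equally long rating lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two rating lists of equal length (>= 2)")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a rater who gives identical ratings has no variance")
    return sxy / sqrt(sxx * syy)


# Hypothetical nine-point predictions by two raters who read the same
# six interview summaries independently.
rater_1 = [5, 3, 7, 4, 6, 2]
rater_2 = [6, 3, 6, 5, 7, 3]
reliability = pearson_r(rater_1, rater_2)
```

A coefficient near 1 would mean the two raters rank the examinees alike; it says nothing about whether either rater's predictions are valid.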

(2) Test validity. — Clinical predictions on a nine-point scale of
success or failure in elementary pilot training are available for validation.
These ratings represent an over-all evaluation of the interview material,
made at the end of the conference and after the interviewer had
completed his interview summary. The ratings of the interviewers were
converted, for purposes of comparison, into a common scale, with a
mean of 4.5 and a standard deviation of 1.5. Validation results are given
in table 24.14.
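
The conversion to a common scale can be sketched as follows. The report gives only the target mean (4.5) and standard deviation (1.5); that each interviewer's ratings were rescaled linearly from his own mean and standard deviation is an assumption made here for illustration, and the function name and sample ratings are hypothetical.

```python
from statistics import mean, pstdev


def to_common_scale(ratings, target_mean=4.5, target_sd=1.5):
    """Linearly rescale one interviewer's ratings to a common scale."""
    m = mean(ratings)
    sd = pstdev(ratings)
    if sd == 0:
        # An interviewer who gave identical ratings carries no spread;
        # every rating maps to the target mean.
        return [float(target_mean)] * len(ratings)
    return [target_mean + target_sd * (r - m) / sd for r in ratings]


# Hypothetical raw nine-point ratings from a lenient and a strict interviewer.
lenient = [5, 6, 6, 7, 9]
strict = [1, 2, 3, 3, 5]
lenient_scaled = to_common_scale(lenient)
strict_scaled = to_common_scale(strict)
```

After rescaling, a given scaled value denotes the same relative standing within each interviewer's own set of ratings, which is what makes ratings from different interviewers comparable.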

Table 24.14. — Validation data for clinical predictions based on two groups of pilots
in primary training using the graduation-elimination criterion, for the Conference
for the Interpretation of Test Scores and Occupational Background, CE707A

Neither of these validity coefficients is significantly different from
zero. Clinical predictions of air-crew success derived from the interview
do not significantly differentiate graduates and eliminees from elementary
flying training.

Ratings of self confidence on a five-point scale are also available. These
ratings were an over-all evaluation of the examinee’s confidence as
manifested in his behavior during the interview and in the life-history data.
The ratings were converted to a scale with a mean of 3.00 and a
standard deviation of 0.75. The biserial validity coefficients of the confidence
ratings were determined on a sample of 293 pilots in elementary training,
80 percent of whom graduated. The biserial coefficient, -0.07, was not
significantly different from zero.
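
For readers unfamiliar with the biserial coefficient, the following sketch uses the standard textbook formula, r_bis = ((M_g - M_e) / SD_t) (pq / y), where M_g and M_e are the mean ratings of graduates and eliminees, SD_t is the standard deviation of all ratings, p is the proportion graduating, q = 1 - p, and y is the ordinate of the unit normal curve at the point dividing p from q. The report does not reproduce its computing formula, so this is an assumption, and the data in the example are hypothetical.

```python
from statistics import NormalDist, mean, pstdev


def biserial_r(ratings, graduated):
    """Biserial correlation of ratings with a graduate/eliminee criterion."""
    grads = [r for r, g in zip(ratings, graduated) if g]
    elims = [r for r, g in zip(ratings, graduated) if not g]
    if not grads or not elims:
        raise ValueError("need at least one graduate and one eliminee")

    p = len(grads) / len(ratings)      # proportion graduating
    q = 1.0 - p
    sd_total = pstdev(ratings)         # SD of the whole sample
    if sd_total == 0:
        raise ValueError("ratings have no variance")

    unit_normal = NormalDist()
    # Ordinate of the unit normal curve at the cut separating p from q.
    y = unit_normal.pdf(unit_normal.inv_cdf(p))
    return (mean(grads) - mean(elims)) / sd_total * (p * q / y)


# Hypothetical confidence ratings for ten students, eight of whom graduated.
ratings = [3, 4, 3, 2, 3, 4, 3, 3, 2, 3]
graduated = [True, True, True, True, True, True, True, True, False, False]
r_bis = biserial_r(ratings, graduated)
```

Because the formula assumes a normally distributed trait underlying the pass-fail criterion, the coefficient can exceed 1 in small or markedly non-normal samples.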

The correlations of the clinical predictions based on CE707A with
those based on other techniques are generally low. These results for
several samples of pilots in elementary training are presented in table 24.15.

Table 24.15. — Correlations of clinical predictions based upon Conference for the
Interpretation of Test Scores and Occupational Background, CE707A, with
predictions based upon other techniques
Analysis of the factual material in the interview, or of the major areas
of motivation, socialization, emotional stability, and occupational
background, was not attempted.
Evaluation. — The influence of examiner differences on the interview
procedure and results was not determined in this study. There were
several phases of the interview procedure in which the factors of skill and
intelligence of the examiner might become a critical variable: (a) In
writing the interpretive summary and evaluating the factual content of
the interview for areas of personality such as stability, maturity,
adaptability, and motivation; (b) in questioning and probing certain responses,
taking up various leads to questions, as from suggestive remarks, and
following them through; (c) in making the clinical predictions;
evaluating the relevance and importance of various phases of the life-history
data, relating them with the examiner’s conception of the personality
factors required for success in air-crew performance, and making a clinical
rating on this basis.

The clinical predictions depended partly on an examiner’s own views
of the requirements of air-crew duties with respect to personality factors,
and partly on his skill as an interpreter of the factual life-history
material in relation to various areas of personality. In arriving at their
predictions, the interviewers were forced to determine for themselves the
relative weights to be assigned to the various facts in the life history.
Facts of life history that were elicited also differed from one examiner
to another and from one examinee to another. It seems important to note
that the interpretations of the factual materials were in no way
determined by uniform criteria or by a single theory of personality. It seems
legitimate to question whether the validity data pertaining to the ratings
reflect the true validity, either of the factual data of the interview alone
or of the interpretive features of the interview relevant to various areas
of personality.

The analysis presented here does not, by any means, exhaust the
possible analyses or further uses of the data. The data were not used in
connection with two of the original aims for which they had been intended, the
final conference and case history.

Observation of Atypical Behavior During Psychomotor
Testing, CE708A 15

This technique is based on the assumption that both the psychomotor
tests and the air-crew training situations involve the performance of
complex tasks under stress. It is thought that evidence of unusual tension,
confusion, or excessive emotionality observed in the testing situation might
serve as a basis for predicting similar behavior in the air-crew training
situation.

Description. (1) Internal characteristics. — This is not a test but rather
a descriptive rating procedure utilized when atypical behavior is observed
by the regular psychomotor examiner. The procedure requires the
examiner to record any behavior that he considers atypical in each group of
four examinees taking the psychomotor test simultaneously.
It was believed that the experience of the examiner with the
psychomotor test would provide a sound basis for deciding whether individual
behavior indicated tension, confusion, or any other striking
characteristic. The object of these observations was to isolate the individuals who
were judged to represent the extremes of atypical behavior.

(2) Administration. — When atypical behavior was observed, the
examiner wrote a brief description of it after the four examinees had left
the testing room. The examiner also indicated the adequacy of his
opportunities for observing the examinees.

In the use of this technique, two rules were observed: (1) The
examinees were given no reason to believe that their conduct during the
psychomotor tests was being observed; for example, the data sheets were
concealed from the examinees, and (2) the observational data obtained
by each examiner were independent from similar data obtained by other
examiners; for example, the examiners did not discuss observational
findings on any examinee with each other.

(3) Scoring. — The examiner rated only those men considered
atypical. He checked appropriate categories on the rating sheet and also wrote
a very brief description of the behavior observed after the appropriate
categories provided on the data sheet.

Briefly, the categories were: (1) Tension, the examiner indicating and
describing undue neurotic tension; (2) confusion, including poor
attention, misunderstanding of test task, or erratic performance; (3)
verbalizations, exclamations, rationalizations, and nervous speech; (4)
disobedience, including willful disregard of instructions; and (5) other. In
addition, the examiner checked the adequacy of his observation as either
good, fair, or poor.

Statistical results. — No statistical data are available on this procedure.
Reliability coefficients were not determined. The basis for computation
would be observation of the same examinee in a given test by two or
more examiners. This was not done. Only occasionally did two or more
examiners report atypical behavior on the same examinee. Validity data
were not obtained. Preliminary analysis of the data indicated that
atypical behavior was noted in approximately one out of eight examinees.

Observation During Psychomotor Testing Rest Period, CE709A 16

The design of this procedure was based on the theory that a good
measure of individual personality characteristics might be obtained in a
relatively informal, social situation structured to evoke relatively
spontaneous and uninhibited comments, expressions of attitudes, and behavior.
It was hoped that the observational material gained in this manner would
aid, eventually, in giving valuable material for use in case histories and
in making composite predictions for each cadet based on all the clinical
procedures.
Description. — CE709A is not a test in the sense of having
standardized questions to be answered. Rather, the men are placed in a situation
that is thought to be sufficiently provocative to cause spontaneous
reactions to the psychological testing and to their fears and hopes in regard
to flying.

(1) Internal characteristics. — The procedure utilizes a waiting station
in the psychomotor test section for groups of four examinees during the
15-minute period immediately following the Finger Dexterity test. The
group is observed, insofar as possible without their knowledge, by the
examiner, who participates in the social situation. In order to provoke
conversation and reactions indicating significant motives or attitudes,
certain objects calculated to cause comment were placed in the room. On
the wall were a cadet recruiting poster and pictures from Life magazine
representing the duties of bombardier, navigator, and pilot. In addition,
there were scattered about the room a number of parts from wrecked
airplanes: a broken propeller, a burned brake drum, and a number of broken
and twisted instruments.

(2) Administration. — The examinees were taken to the waiting room
for the 15-minute period following the Finger Dexterity test. The
examiner was there ostensibly to supervise the filling out of appointment slips
for the clinical group tests. In this way the examiner’s function was not
open to question. This procedure took no more than 2 minutes, which
allowed time for the examiner to make the necessary observations and
ratings. After the appointment slips were filled out, the examiner
informed the men that they were to remain in the room for the remainder
of the rest period but were free to sit or move around, to make
themselves comfortable, and to talk as much as they liked. The examiners
were cautioned (a) not to observe the students too cautiously, (b) to
keep all data sheets out of the room, (c) to act as natural as possible
so as not to arouse suspicion, and (d) to join in the discussions as much
as necessary and ask informal questions such as, “How are the tests
going?” in order to provoke discussion.

At the conclusion of the observation period, the four examinees moved
on to the next psychomotor test. The examiner went to another room
to write up his findings, while an alternate examiner made the
observations on the next group.

(3) Scoring. — In the separate room the examiner filled out a check
list of traits and then wrote a thumbnail sketch of each man. The check
list included 22 traits: Participant, nonparticipant, leader, entertainer,
withdrawn, stolid, nervous, elated, depressed, curious, confident, profane,
rationalizing, seeks group approval, seeks examiner’s approval, worried,
griped, tense, pleasing to examiner, annoying to examiner, idealistic, and
realistic. Definitions of these categories were furnished so that ratings
would be consistent. A double check was used to indicate an extreme
degree of the trait.
The thumbnail sketch was aimed at defining the individual, and was
intended to lead toward the rating of personality fitness for air crew.
It was not to be a restatement of traits that were checked; it was
intended to be more general and to be expressed in spontaneous
terminology. The findings were to be used to support the rating of personality
fitness for air-crew training, which was made on a nine-point scale adopted
in the Clinical Techniques Project.

Statistical  results. — Observations  were  made  on  approximately  600 
aviation  students. 

(1) Test validity. — The validation coefficients obtained for the clinical
predictions are given in table 24.16.


Table 24.16. — Validation data for clinical predictions based on Observation during
Psychomotor Testing Rest Period, CE709A, using the graduation-elimination
criterion

  N_t     p      M_g     M_e     SD_t          r_bis   [illegible]
  273    0.90    ...     ...     [illegible]   0.10    ...
  176     .82    4.23    3.66    [illegible]    .21    .22

Correlations  with  predictions  based  on  other  clinical  techniques  are 
shown  in  table  24.17. 


Table 24.17. — Correlations of predictions for samples of pilots in primary training
based on Observations during Psychomotor Testing Rest Period, with predictions
from other clinical techniques

  Name of technique                          N       r
  Interview (CE707A)                        300     0.06
  Observational Stress Test (CE710A)        291      .02
  Interaction Test (CE425)                  300    ¹ .16

  Interview (CE707A)                        174      .13
  Rorschach 1st Impression (CE701A-1)       173    ² -.17
  Observational Stress Test (CE710A)        173      .06
  Interaction Test (CE425D)                 170     -.07

¹ Significant at the 1 percent level.
² Significant at the 5 percent level.
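
The significance levels attached to two of the coefficients can be checked approximately with Fisher's z transformation, a present-day large-sample test of a correlation against zero. The report does not say which test its authors applied, so this is a swapped-in method, and the function name `two_sided_p` is hypothetical. The inputs are the r and N values given for the Interaction Test (.16, N = 300), the Rorschach 1st Impression (-.17, N = 173), and the Interview (0.06, N = 300).

```python
from math import atanh, sqrt
from statistics import NormalDist


def two_sided_p(r, n):
    """Approximate two-sided p value for H0: rho = 0, via Fisher's z."""
    if n <= 3 or not -1.0 < r < 1.0:
        raise ValueError("need n > 3 and -1 < r < 1")
    # atanh(r) is approximately normal with standard error 1 / sqrt(n - 3).
    z = atanh(r) * sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


p_interaction = two_sided_p(0.16, 300)   # marked significant at 1 percent
p_rorschach = two_sided_p(-0.17, 173)    # marked significant at 5 percent
p_interview = two_sided_p(0.06, 300)     # not marked as significant
```

Under this approximation the first value falls below .01, the second between .01 and .05, and the third well above .05, in agreement with the levels reported for these three coefficients.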


Evaluation. — On the basis of present statistical treatment, the results
obtained with this observational technique indicate that it has rather
questionable value for the prediction of air-crew success. Certain features
of the test limited its usefulness: (a) The examiners making the
observations and predictions had little previous experience or training with
clinical ratings; (b) predictions were colored, in part, by examiner
stereotypes concerning the personality requirements of air-crew duties;
(c) the attitude of the examinees may have been conditioned by the
number of psychomotor tests they had taken prior to the rating period;
(d) since in many cases the rest period came after the students had
taken a number of tests, the predictions in an undetermined number of
cases were possibly based more on comments the students made about
how well they had done than was true in other cases. Some of these
deficiencies could obviously be corrected and others minimized. The clinical
observations from this test were originally intended for use in a
case-conference procedure, rather than being used alone, but due to the
abandonment of that part of the project, no further clinical use of the data
was attempted.


Observational Stress Technique, CE710A 17


The Observational Stress Technique was designed in an attempt to
represent the type of stress the aviation trainee would experience in the
instructor-student relationship during flying training. It was assumed
that his performance under critical observation would be best measured
by tasks requiring divided attention, selective responses, and relatively
fine controlled manipulations. The hypothesis, then, is that the examinee's
poise and self-assurance or his tendency to become confused or blocked,
as measured by this technique, is related to success in air-crew training.

Description. — Since this is primarily an apparatus test and will be
described as such in another report of this series, only those aspects
more pertinent to the stress technique as measured by ratings will be
mentioned here.

(1) Internal characteristics. — The apparatus consists of seven
examinee's controls mounted in a table. The examinee's task is to stop the
hand of a clock by keeping all seven controls set correctly at the same
time. The controls consist of a foot pedal operated by the right foot, a
stick operated by the right hand, and five levers manipulated by the left
hand. These five levers are an assembly of throttle, mixture, and
propeller-pitch controls adapted from a light plane and an assembly of two
controls set by a thumb catch. The correct setting of each of six of the
controls is indicated to the examinee when a corresponding signal light
is illuminated and of the seventh control when a corresponding buzzer
shuts off. During the test, the correct setting for each control is changed
frequently by the examiner, according to a standardized schedule.

(2) Administration. — One examinee at a time is tested by one
examiner and one observer. The directions require the examiner to make
standardized criticism of the examinee's performance in an attempt to
increase the stress aspect of the situation.

The examiner and observer are separated from the examinee by a
one-way-vision screen, so that the examinee may be observed without his
being able to note any unstandardized reaction on the part of the examiner
or observer. The examiner administers the directions and the stress part
of the test. The observer records his detailed observations of the
examinee's performance on a prepared data sheet, and also records 6 clock
scores which give the error times on several of the controls during each
of the three test periods.
When the examinee enters the room, he is told in a forceful and critical
manner that during the test he will be under constant observation from
behind a screen. Then he is told to seat himself so that he can manipulate
the controls easily. The test sequence follows:

(a) Anticipation period (1 minute): Examinee sits for 1 minute
without further instruction. If he touches the controls, speaks, or
attempts to rise, he is told to remain seated at ease.

(b) Directions period (1 minute): Full instructions are given orally
concerning manipulation of the controls. Since they have little
significance in connection with the observational technique described here, the
instructions will not be quoted.

(c) Test period A (3 minutes): During the first minute of this
period, the examinee attempts to match only one pattern. During the next
2 minutes of this period, the pattern changes 4 times at standardized
intervals of 30 to 45 seconds. At approximately the same time intervals,
the examiner administers standardized stress directions:

Stop the clock quickly. Always set the R control first. Set the stick and pedal
controls next and keep their lights on while you set the rest of the controls. Set the
"T" control next after these three * * * If your movements are jerky you will
get a very poor observation score. Keep that stick away from the side of the slot!
* * * Turn the buzzer off * * * If you make the lights flicker, it shows us
that you are tense * * * Watch yourself * * * A record is being made of
every false move that you make * * * Be quick!

(d) Test period B (3 minutes): During the first 1 minute and 15
seconds the examinee has only one pattern to match, while during the last
1 minute and 45 seconds the patterns change 4 times in a standardized
fashion as in Test Period A. Again the examiner administers
standardized stress directions:

Set the controls! * * * You must work more quickly * * * Your scores
are not nearly good enough yet. Remember we are rating you the same way a
primary instructor would rate you on your flying * * * You will have to do
things exactly right or you are through * * * Are you letting a simple test
like this confuse you? * * * By now you should be able to get all the lights on
quickly and get a good clock score. But how well are you really doing? Size
yourself up honestly * * * You are still making too many errors.

(e) Rest period (1 minute): During this rest period the clock scores
are recorded and observational ratings made. Stress directions continue:

* * * You will wait while we record your observation score * * * You
are still under critical observation.

(f) Test period C (2 minutes): During this test period the
administrative procedure is directed toward failure stimulation. The patterns are
changed 6 times after intervals of 15, 20, 15, 15, 10, and 40 seconds, in
order. Directions are:

Don't make lights flicker on and off. Be steady * * * quit making errors. You
aren't moving fast enough * * * More speed * * * Hurry and stop the
clock * * * Last chance * * * Set controls quickly * * * You are
still making errors.
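The timing of the session described above can be laid out as a small schedule. The following is only a sketch of that arithmetic: the constant and function names are hypothetical, and the interval list for test period C assumes that the scan's "15. 15" should be read as "15, 15".

```python
from itertools import accumulate

# Period lengths in seconds, in the order given in the text.
PERIODS = [
    ("Anticipation period", 60),
    ("Directions period", 60),
    ("Test period A", 180),
    ("Test period B", 180),
    ("Rest period", 60),
    ("Test period C", 120),
]

# Intervals (seconds) preceding each of the six pattern changes in period C.
# Assumption: the scan's "15. 15" is read as "15, 15".
PERIOD_C_INTERVALS = [15, 20, 15, 15, 10, 40]


def change_times(intervals):
    """Seconds from the start of the period at which each pattern change occurs."""
    return list(accumulate(intervals))


def total_minutes(periods):
    """Total scheduled length of the session in minutes."""
    return sum(seconds for _, seconds in periods) / 60


print(change_times(PERIOD_C_INTERVALS))
print(total_minutes(PERIODS))
```

Under this reading the last pattern change in period C falls shortly before the end of the 2-minute period.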
(3) Scoring. — In this account, we are interested only in two types of
data: Ratings of observed traits and a general prediction of air-crew
success.

Ratings of behavior were made by the observer during each of the six
periods of the test. During the test period the examinee was rated on
such characteristics as general manner (poised, ill at ease, relaxed, tense,
confident, confused, or blocked); comprehension of task (adequate,
uncertain, or poor); operation of controls (erratic, smooth, hesitant,
impetuous, cautious, exaggerated); reactions to criticism (obedient, ignores
it, confused, annoyed, slow, or prompt); incidental behavior (frowning,
absorbed, anxious, grinning). The rating consisted merely in noting
whether or not the examinee exhibited these characteristics. In addition,
whenever possible the observer wrote a thumbnail sketch to supplement
and clarify the ratings.

Predictions of air-crew success were made by the observer and
examiner together on a nine-point scale for probability of success in air
crew, in general, and as a bombardier, navigator, and pilot, specifically.

Statistical  results. — The  results  are  based  on  a  sample  of  classified 
student  pilots. 

(1) Validation data. — The observer's ratings of the subject's
expressive behavior during the test periods were to be used in a final case
conference. Since the case conference procedure was abandoned, these
observational data were never utilized in a clinical manner.

The validity of predictions of success was determined only for pilot
primary training. The data are given in table 24.18.


Table 24.18. — Validation data for clinical predictions based on a group of pilots
in primary training, using graduation-elimination criteria, for the
Observational Stress Test, CE710A

  N      p_g     M_g     M_e     SD_t    r_bis   [illegible] r_bis
  396    0.??    ...     ...     ...     0.15    0.1?
  119    .81     4.59    ?.75    1.47    .55¹    [illegible]

¹ Significant at the 1 percent level.


The marked improvement in validity of the second over the first group
of examinees is considered to be attributable to the observer becoming
more efficient through practice in the prediction of eliminees. Since the
clinical predictions are strongly related to the clock scores (r=0.61), it
is possible that most of the validity of the ratings is due to the
examiners' knowledge of how well the subject was doing rather than to
observations of behavior apart from this knowledge.

(2) Other data. — The intercorrelations of clinical predictions based on
this test with other techniques for the same examinees are given in
previously presented tables in this chapter.

Evaluation. — From the data presented, three conclusions may be
reached as to the utility of the Observational Stress Test, CE710A.
(a) Subjective predictions of success in elementary pilot training
based on a clinical evaluation, as well as on other qualitative
observations while the examinee is under critical observation in a stress
situation, seem to possess significant validity; but these data are probably
contaminated by indirect knowledge of test scores.

(b) As in other clinical procedures, the influence of the
examiner-examinee-observer interaction variables on the validities of the clinical
predictions is undetermined. All ratings were made by one examiner
and one observer.

(c) The possible influence of objective clock scores in determining the
clinical predictions has not been determined. If the strong correlation
between clock scores and ratings represents a personality variable rather
than coordination or skill, then it might be better to use the objective score
(clock) to measure it in place of the ratings.


Control Confusion Test, CE214A 18

This test embodies essentially the same characteristics as the
Observational Stress Technique, CE710A, lacking only the verbal-stress
directions which were administered in that instrument.

Description. — The  same  apparatus  is  employed  in  CE214A  as  in  the 
Observational  Stress  Technique.  It  is  administered  by  one  examiner  to 
one  examinee  at  a  time. 

(1) Scoring. — In addition to recording the objective clock scores,
ratings were made by the examiner. The examinee was rated on a
three-point scale for comprehension of total task, smoothness of operation,
flexibility, and tension. Speed of comprehension was rated on a 10-point
scale. Prediction of pilot success was rated on a nine-point scale with
the points defined as follows:


  Points   Probability                  Verbal statement
  8        Almost 8 chances out of 8    Success highly probable.
  7        Almost 7 chances out of 8    Success very likely.
  6        Almost 6 chances out of 8    A good bet for success.
  5        Almost 5 chances out of 8    A little better than 50-50 chance of success.
  4        Almost 4 chances out of 8    A 50-50 chance of success.
  3        Almost 3 chances out of 8    A little less than 50-50 chance of success.
  2        Almost 2 chances out of 8    A poor bet for success.
  1        Almost 1 chance out of 8     Failure very likely.
  0        Almost 0 chance out of 8     Failure almost certain.
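The nine-point scale defined above amounts to a lookup from points to an approximate probability and a verbal statement. The sketch below is only an illustration: the names are hypothetical, the "almost k chances out of 8" wording is simplified to the fraction k/8, and the scan's "SO-SO" is read as "50-50".

```python
# Verbal statements for each point of the nine-point scale (0 through 8),
# transcribed from the table; "SO-SO" in the scan is read as "50-50".
VERBAL_STATEMENTS = {
    8: "Success highly probable.",
    7: "Success very likely.",
    6: "A good bet for success.",
    5: "A little better than 50-50 chance of success.",
    4: "A 50-50 chance of success.",
    3: "A little less than 50-50 chance of success.",
    2: "A poor bet for success.",
    1: "Failure very likely.",
    0: "Failure almost certain.",
}


def describe(points: int) -> tuple[float, str]:
    """Return (approximate probability of success, verbal statement) for a rating."""
    if points not in VERBAL_STATEMENTS:
        raise ValueError("rating must be an integer from 0 to 8")
    # Simplification: "almost k chances out of 8" is treated as exactly k/8.
    return points / 8, VERBAL_STATEMENTS[points]


for points in (8, 4, 0):
    print(points, describe(points))
```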

Statistical  results. — Observations  were  made  on  about  500  classified 
pilots. 

(1) Test reliability. — Since each of the ratings was made only once,
by one examiner, reliabilities could not be calculated. The basis for
computation would be observation of the examinee in the test situation by
two or more independent examiners, or by a retest procedure. Owing
to the shortage of personnel, neither procedure was attempted.

18 Developed at Psychological Research Unit No. 1. Chief contributors: Maj. Glen L. Heathers
and Tech/Sgt. James C. Crumbaugh.

(2) Validity coefficient of ratings. — Two samples of classified pilots
in primary training were divided for purposes of validation. The
division was made on the basis of odd and even calendar days of the month
during which they had been tested. The results are shown in table 24.19.


Table 24.19. — Validity coefficients of clinical ratings for pilots in primary school
divided on the basis of odd and even calendar days of the month for the Control
Confusion Test, CE214A, using the graduation-elimination criteria

  Rating                            Day         N      p_g    M_g    M_e    SD_t   r_bis
  1. Speed of comprehension         Odd days    228    0.85   6.5    5.8    2.5    0.15
                                    Even days   260    .88    6.3    5.0    2.5    .26
                                    All days    488    .86    6.4    5.4    2.5    .20
  2. Comprehension of total task    Odd days    220    .84    2.9    2.5    1.3    .12
                                    Even days   263    .85    2.9    2.1    1.3    .34
                                    All days    4?2    .86    2.9    2.3    1.3    .25
  3. Smoothness of operation        Odd days    229    .84    2.3    2.1    1.2    .08
                                    Even days   263    .88    2.3    1.6    1.4    .24
                                    All days    492    .86    2.3    1.9    1.4    .16
  4. Flexibility                    Odd days    229    .84    2.5    2.3    1.2    .10
                                    Even days   262    .88    2.5    1.6    1.4    .33
                                    All days    492    .86    2.5    1.9    1.3    .23
  5. Tension                        Odd days    229    .84    2.4    2.4    1.1    -.01
                                    Even days   263    .88    2.6    2.0    1.2    .27
                                    All days    492    .86    2.5    2.2    1.1    .13
  6. Prediction of pilot success    Odd days    230    .84    5.6    5.0    2.1    .15
                                    Even days   264    .88    5.5    4.2    2.3    .31
                                    All days    494    .86    5.6    4.6    2.2    .23
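The columns of the validity table (proportion graduating, the two group means, the standard deviation) are the ingredients of a biserial correlation. The sketch below assumes that the tabled coefficient is the usual biserial r against the graduation-elimination criterion and that the first mean is that of the graduates and the second that of the eliminees; neither assumption is stated outright in the text, and the function name is hypothetical. Because the tabled means are rounded to one decimal, the result can be expected to agree with the printed coefficient only approximately.

```python
from statistics import NormalDist


def biserial_r(p_grad: float, mean_grad: float, mean_elim: float, sd_total: float) -> float:
    """Biserial correlation between a rating and a pass/fail criterion.

    p_grad     proportion of the sample graduating
    mean_grad  mean rating of graduates
    mean_elim  mean rating of eliminees
    sd_total   standard deviation of the rating in the whole sample
    """
    q_elim = 1.0 - p_grad
    normal = NormalDist()
    # Ordinate of the unit normal curve at the point dividing it into p and q.
    ordinate = normal.pdf(normal.inv_cdf(p_grad))
    return (mean_grad - mean_elim) / sd_total * (p_grad * q_elim / ordinate)


# Speed of comprehension, all days, as read from the table.
print(round(biserial_r(0.86, 6.4, 5.4, 2.5), 2))
```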

The intercorrelations of the six ratings are presented in table 24.20.


Table 24.20. — Intercorrelations of clinical ratings for pilots in primary school
based on the Control Confusion Test, CE214A

  Rating                              I      II     III    IV     V      VI
  I.   Speed of comprehension         ...    0.66   0.57   0.56   0.34   0.64
  II.  Comprehension of total task    0.66   ...    .73    .70    .34    .19
  III. Smoothness                     .57    .73    ...    .70    .43    .76
  IV.  Flexibility                    .56    .70    .70    ...    .37    .76
  V.   Tension                        .34    .34    .43    .37    ...    .55
  VI.  Prediction of pilot success    .64    .19    .76    .76    .55    ...

Evaluation. — In a survey of the results obtained with the Control
Confusion Test, CE214A, two points remain to be made in addition to
the conclusions drawn from the observation-stress technique:

(a) There seemed to be an over-emphasis on skill in achieving the
proper setting of delicately-balanced controls on the basis of visual
stimuli. The relatively fine adaptations required of the examinee do not
reveal very full information upon which to make clinical ratings.

(b) A plan, whereby more reliable ratings might be made through
increasing the scope of the examiner's observations, would be to have a
series of trials increasing in difficulty until a virtual break-down of
performance is reached, after which a recovery trial would be administered.
Clinical ratings based on such a performance might have more prognostic
value.

The Interaction Test, CE425B 19

This test was based on the assumption that semistandardized social
situations are productive of significant behavior that can be rated and
recorded by trained observers. German military psychology relied heavily
on behavior in such situations as a source of data on leadership ability
and other traits important in officers. Traits such as cooperation,
resourcefulness, leadership, dominance, social adjustment, and
self-confidence are those generally believed to be assessable in such situations. It
was hypothesized, further, that these characteristics are related to
success in air-crew performance.

Description. — The essentials of the semistandardized, social-situation
test were: (1) A task to be performed; (2) persons with whom it was
to be performed; and (3) an observer to record significant behavior. The
test situation could have presented either a thought (verbal) problem or
a concrete (overt) task. Exploratory work resulted in the conclusion that
concrete tasks produce interactive movements within a group and
encourage verbalization, while thought problems encourage silent mental
processes with less directly observable behavior. This exploratory work
was done as CE425A, from which the present test form emerged.

(1) Internal characteristics. — The task selected for solution was a
modified Wiggly Blocks (7) problem. Four examinees were placed in
the testing situation in which discussion and cooperation were essential
for a satisfactory problem solution. The examiner recorded a
chronological description of events and made ratings of the behavior of the
examinees.

(2) Administration. — Three sets of noninterchangeable Wiggly
Blocks were placed unassembled in a standardized fashion on a 2 x 3
foot table and covered by a cloth. The blocks were placed on a square
table, shuffled, laid out flat, and parallel, with the ends pointing in their
correct direction. Four examinees were seated, one at each side of the
table. The examiner stood to one side while reading directions and at
his starting signal removed a cloth, disclosing the test materials. The
instructions required that the group plan the solution to the problem. No
method was prescribed for solving it. Part of the instructions follow:

* * * task which you are to perform as a group. This task consists of three
separate puzzles. There are 27 pieces altogether, which you are to put together as
quickly as possible to form three rectangular blocks of nine pieces each. You will
have 10 minutes. Each of you will have two separate scores: One based on the time
taken by the group to finish all three puzzles, and the other based on what you
yourself do. You may solve the puzzles in any way you wish. You may talk during this
test.

19 Developed at Psychological Research Unit No. 1. Chief contributors: Capt. [name illegible] and
Capt. Donald E. Super.

After  5  minutes  elapsed,  the  examiner  informed  the  group  that  they 
had  5  more  minutes.  Similar  comments  were  made  after  7  minutes  (“3 
minutes  left”)  and  after  9  minutes  (“1  minute  left”). 

Upon completion of directions, the examiner wrote a chronological
history of the group performance and noted behavior of members of
the group to be used as a basis for rating them.

(3) Scoring. — As in other clinical techniques described in this
chapter, predictions for success in air crew in general, and as pilot,
bombardier, and navigator were made, each on a nine-point scale. These
judgments were made by the examiner on the basis of (a) a survey of the
check list of traits which was used to rate each subject, and (b) detailed
descriptions of the student's test behavior.

The following traits were included in the check list and judged on a
five-point scale: Cooperation, integration, dominance, aggression,
emotional stability, and fertility of ideas. In addition to the ratings,
qualitative notes were made in the form of a chronological description of events,
in which was emphasized the role of each individual in his approach to
the task, in its execution, and in his relations within the group.

The examinees seemed to fall into three general categories: (a) A
small number assumed definite leadership, had fertile ideas, and worked
effectively with the group, primarily as an integrating force; (b) a small
number worked alone and were more destructive than helpful; and (c)
the largest number apparently faced the task calmly and intelligently,
were not outstanding influences in the group, but seemed to be helpful
participants and occasionally leaders and integrators.

In the rated predictions for air crew in general, for bombardier,
navigator, and pilot success, those of category (a) were given ratings of 6,
7, or 8; those of category (b), 4 or below; and (c), 4, 5, or 6.

Statistical  results. — Observations  were  made  on  about  600  classified 
pilots. 

(1)  Test  reliability. — Reliability  coefficients  were  not  determined. 

(2) Test validity. — Validation coefficients obtained for the clinical
predictions are given in table 24.21.

Table 24.21. — Validation data for clinical predictions for groups of pilots in
primary training, using a graduation-elimination criterion, based on the
Interaction Test, CE425B


Table 24.22. — Validation data for check list of traits employed in the Interaction
Test, CE425B, for a group of pilots in primary training, graduation-elimination
criterion

[Table body largely illegible in this copy. Legible trait names: Cooperation,
Integration, Dominance. Legible coefficients: -0.09, -.17, .09; their assignment
to traits cannot be recovered.]

No analysis was made of the qualitative data in the chronological
record of each group's performance.

Evaluation. — From  the  data  presented,  three  conclusions  may  be 
reached  as  to  the  utility  of  the  Interaction  Test,  CE425B : 

(a) Check-list ratings of personality traits based upon observations
of social behavior in a small group of men engaged in a semistructured
group task are not valid for the prediction of success in elementary pilot
training.

(b) Subjective predictions of success in primary pilot training based
on an over-all clinical evaluation of the personality ratings, as well as
other qualitative observations in a semistructured group task, are not
valid.

(c) As in other clinical procedures, the influence of the examiner
variable on the validities of the check list, as well as on the clinical
predictions, is undetermined. It will be noted that in this test all ratings
were made by one examiner.

The Relationship of the General Appearance of a Cadet
to His Success in Primary Training

The purpose of this experiment was to validate appearance ratings of
aviation students against success in primary training. The rationale for
this experiment was not that physiognomy is directly related to success,
but rather that prejudices might be operative in determining success or
failure in primary training.20

It was believed that flying instructors, as well as officers conducting
the adaptability-rating-for-military-aeronautics (ARMA) interview, have certain
prejudices common to our culture, and that because of these prejudices
fine-appearing aviation students might be more likely to graduate from
flying training and to pass the ARMA than other aviation students.

Description. — The ratings in this study were made by the regular group
test proctors during the course of administration of the written
classification battery in the spring of 1943. Ratings were made only after the
proctors had had several hours in which to observe the men whom they
were to rate.

20 Developed as an experiment at Psychological Research Unit No. 1. Chief contributors:
Capt. Stuart W. Cook, Capt. Lloyd G. Humphreys, Capt. Robert Murphy, and Staff of Group
Test Section.
(1) Administration. — Ratings were made on a three-step scale. Three
independent ratings were made by three observers, for each examinee.
Each rater had 60 examinees to rate during each session.

The proctors were instructed to make their ratings as if they were
selecting new men for a fraternity. They were told to look at facial
features, complexion, name, hair, bearing, and clothing, in arriving at
their judgment.

(2) Scoring. — A rating of A was given an aviation student whose
general appearance indicated that he was among the most likely to
succeed as a pilot, a B was given the intermediate group, and a C was given
those thought to be among the least likely to succeed. The letter ratings
were assigned numerical values of 3 for A, 2 for B, and 1 for C, and
then these scores for each student were summated to obtain a single
value.
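The scoring rule just described is a simple sum of three letter ratings. A minimal sketch follows; the names `LETTER_VALUES` and `combined_rating` are hypothetical and serve only to make the arithmetic explicit.

```python
# Numerical values assigned to the letter ratings, as stated in the text.
LETTER_VALUES = {"A": 3, "B": 2, "C": 1}


def combined_rating(letters):
    """Sum the three observers' letter ratings for one aviation student."""
    if len(letters) != 3:
        raise ValueError("exactly three independent ratings are expected")
    return sum(LETTER_VALUES[letter] for letter in letters)


# One student rated A, B, and C by the three observers.
print(combined_rating(["A", "B", "C"]))
```

With three raters the combined value can range from 3 (three C ratings) to 9 (three A ratings).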

Statistical results. — Data are available for a sample of pilots who took
primary training, and also for a sample who passed or failed the
adaptability rating for military aeronautics.

(1)  Distribution  of  ratings. — The  observed  distribution  of  ratings  is 
contained  in  table  24.23.  The  expected  distribution,  assuming  lack  of 
correlation  between  raters,  is  also  presented  in  this  table. 


Table 24.23. — Observed and expected (assuming zero correlations between raters)
frequency distributions of appearance ratings in the relationship of the general
appearance of a cadet to his success in primary training

  Combined ratings    Observed frequencies    Expected frequencies
  9                   93                      31
  8                   315                     24?
  7                   705                     710
  6                   844                     993
  5                   570                     601
  4                   262                     221
  3                   115                     2?
  N total             2,904                   2,904

Chi-square for the discrepancy between these two distributions is
526.6, indicating a positive correlation between ratings.
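Each category's contribution to a chi-square of this kind is the squared difference between observed and expected frequency, divided by the expected frequency. The sketch below illustrates the arithmetic only. It uses just the three categories whose observed and expected frequencies are both fully legible in this copy (combined ratings 9, 5, and 4), so it does not reproduce the reported total of 526.6. The names are hypothetical.

```python
def chi_square_terms(observed, expected):
    """Per-category contributions (O - E)^2 / E to the chi-square statistic."""
    return {
        category: (observed[category] - expected[category]) ** 2 / expected[category]
        for category in observed
    }


# Categories with fully legible observed and expected frequencies only.
OBSERVED = {9: 93, 5: 570, 4: 262}
EXPECTED = {9: 31, 5: 601, 4: 221}

terms = chi_square_terms(OBSERVED, EXPECTED)
for rating, term in terms.items():
    print(rating, round(term, 1))
```

The excess of students at the extreme rating of 9, relative to what independent raters would produce, accounts for a large contribution on its own.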

(2) Test validity. — An estimate of the predictive value of the appearance
ratings was made by validating the ratings against the criterion of
success or failure in primary pilot training. Employing a sample of 2,228
pilots in primary training (p_g = 0.?2, M_g = 6.15, M_e = 6.15, and SD_t =
1.34), the obtained coefficient between the ratings and the criterion was
zero.

(3) Relationship of ratings to ARMA scores. — It was thought that
appearance might be a factor in determining the ARMA score. The
correlation between the two ratings is presented in table 24.24.

Evaluation. — The relation between ARMA scores and appearance
ratings shows that the possible source of bias which is present in any
face-to-face selection technique has been here relatively well-controlled. On
the basis of this evidence, it is concluded that any validity found for
face-to-face selection devices, that is, interviews or body-build
measurement, may be free of the factor of personal appearance.


Table 24.24. — Relationship of appearance ratings to the ARMA score

  Appearance rating    ARMA pass      ARMA fail      Total
  9                    [illegible]    4              93
  8                    [illegible]    7              [illegible]
  7                    [illegible]    21             709
  6                    [illegible]    55             844
  5                    [illegible]    28             570
  4                    [illegible]    24             2?2
  3                    108            7              115
  N                    2,757          151            2,908
  M                    6.0?           5.71
  SD                   [illegible]    [illegible]
  r_bis                [illegible]

It is clear that appearance ratings are unrelated to success in primary
training, though very slightly related to adaptability ratings for military
aeronautics. Though prejudices undoubtedly exist, the elimination
procedure apparently minimizes such effects.

SUMMARY  AND  CONCLUSIONS 

The  original  aims  and  assumptions  underlying  the  clinical-techniques 
project  and  the  clinical-type  procedures  were  more  comprehensive  than 
the  scope  of  the  completed  study.  For  various  reasons,  several  major 
aims,  fundamental  to  the  thorough  appraisal  of  the  clinical  approach, 
miscarried. 

The present study can afford only a negative answer to the question
of whether the clinical type of approach, in general, can be of value to
the classification program. During the course of the study, emphasis
shifted steadily from the attempts to develop a global type of analysis,
based on a variety of clinical procedures, toward consideration of the
specific clinical tests individually as predictive instruments and as means
for getting leads for more objective tests.

In the techniques most characteristic of the clinical approach, summarizing
predictions, based on a nine-point scale, of success or failure
in primary pilot training, constituted the major datum for validation.
The results show that the clinical ratings are consistently ineffective in
the prediction of success or failure in primary flying training. This
much gives us a categorical answer regarding clinical predictions of this
type, based upon data such as were used in this project.

At least three sources of influence bear upon a clinical rating in any
single instance: (1) Examiner opinions or stereotypes concerning the
personality factors associated with success or failure in primary pilot
training; (2) individual interpretations of the basic data (whether observational
or projective in nature) with regard to various areas of
personality; and (3) subjective (examiner) weighing of the personality
characteristics thought relevant in estimating the chances for success
of an aviation student. It should be added that little uniformity can be
assumed for the various examiners with respect to their stereotypes,
their interpretations, and the subjective weights they assign to personality
factors. Large individual differences existed in the background,
training, ability, and temperament of the examiners.

It  should  be  emphasized  that  the  attention  given  to  examiner  differ¬ 
ences  in  this  report  underlines  a  basic  difficulty  in  applying  clinical  pro¬ 
cedures  to  a  large-scale  testing  program.  As  long  as  (1)  the  most  im¬ 
portant  datum  of  clinical  procedures  depends,  to  a  critical  degree,  on  in¬ 
dividual  insight,  intuition,  and  skill,  (2)  the  differences  among  exami¬ 
ners,  in  these  respects,  remain  difficult  to  measure  and  control,  and  (3) 
examiner skill remains difficult to communicate or is “incommunicable,”
it  will  always  be  possible  to  attribute  negative  results  to  inadequacies  in 
the  examiners. 

Another fundamental difficulty is that of validating directly the qualitative
clinical interpretations. All attempts to express these interpretations
in quantitative form inevitably exclude all, or certain, critical features
of the interrelatedness and interaction of personality characteristics.
It has been shown that clinical predictions of the type used in this
study are unsatisfactory measures for this purpose in that they introduce
both extraneous assumptions and examiner complications. The validation
of  single  scoring  categories  (as  in  the  Rorschach)  is,  at  best,  a  super¬ 
ficial  procedure,  since  it  slights  the  basic  dictum  of  the  clinical  approach 
which  emphasizes  global  analysis  and  the  interrelationship  existing 
among  the  components  of  personality.  It  must  be  concluded  that  there 
has  been  little  success  with  attempts  to  derive  a  measure  of  the  most 
significant  features  of  the  qualitative  interpretations  (namely,  the  inter¬ 
relationship,  interaction,  and  balancing  of  personality  factors)  that  would 
be  capable  of  direct  validation  against  the  training  criterion. 

One  of  the  original  aims  of  the  project  was  to  develop  a  well-rounded 
clinical  picture  of  the  individual  by  means  of  a  case  conference  in  which 
a  final  prediction  of  success  or  failure  in  air-crew  training  would  be 
based  on  the  results  of  the  entire  clinical  battery.  A  committee  was  to 
have  made  the  predictions,  using  several  sources  of  information  all  rein¬ 
forcing  or  qualifying  each  other,  which  method  possibly  would  have  in¬ 
creased  the  reliability  of  the  prediction  itself  as  well  as  of  the  personality 
picture  constructed  from  the  clinical  materials.  The  failure  to  carry 
through the case conference necessitated abandonment of another original
aim, namely, to write case histories from which one could, upon receipt
of validation data, build up a man analysis of pilot training. This
might have afforded a total pattern for a group of scores, which could
have  served  to  negate  a  favorable  individual  score,  compensate  for  an 
unfavorable  score,  or,  in  general,  give  additional  meaning  to  any  indi¬ 
vidual  score. 

One aim of the project was fulfilled, namely, the development of
objectively-scored group tests based on the principle of projection. The
tests related to this general purpose are: Rapid Projection
Test, CE711C; Picture Evaluation Test, CE712A; Picture Sequence
Test, CE713A; Picture Judgment Test, CE716A; Restricted Word Association
Test, CE702A; and the Empathetic Response Test, CE715A.
In  spite  of  their  relative  objectivity,  none  of  these  instruments  validated 
significantly  against  a  graduation-elimination  criterion  in  primary  pilot 
training. 

The  final  conclusion,  then,  is  that  clinical  predictions,  summarized 
subjectively from the single clinical tests, are of little or no value for
prediction  of  success  in  pilot  training.  A  final  answer,  however,  as  to 
the  useability  of  a  thoroughgoing  clinical  approach  for  classification  pur¬ 
poses  must  rest  on  solutions  to  the  following  problems:  (1)  The  use  of 
clinical  procedures  in  combination,  as  in  a  case-conference  technique, 
(2)  pattern  analyses  which  recognize  and  preserve  the  global  approach 
and  at  the  same  time  are  capable  of  direct  validation,  (3)  examiner 
variability,  and  (4)  the  possible  use  of  other  criteria,  such  as  combat 
performance,  which  might  disclose  some  value  in  the  use  of  clinical 
techniques for selection purposes when training criteria do not.

CHAPTER TWENTY-FIVE

Measures  of  Specific  Traits  of 
Temperament1 


INTRODUCTION 

The  two  preceding  chapters  dealt  with  two  approaches  to  measurement 
of  temperament:  personality  inventories  and  clinical-type  procedures. 
Both of these methods employ instruments designed to reveal information
about a variety of aspects of human temperament. The primary objective
of each test described in this chapter was to differentiate among
individuals and distribute them on a single continuum. The tests discussed
here, then, might be said to explore some of the component parts
of  temperament  as  contrasted  with  the  more  general  approach  of  the 
preceding  chapters. 

It  will  no  doubt  be  obvious  to  the  reader  that  only  a  small  proportion 
of  the  identifiable  traits  of  temperament  were  explored.  The  decision  to 
explore  any  given  trait  was  made  largely  on  one  important  basis.  This 
was  the  apparent  importance  of  the  trait  to  air-crew  success.  Rationales 
for  these  decisions  are  covered  in  the  discussions  of  the  tests  or  groups 
of  tests.  In  general,  however,  the  presumption  that  a  trait  is  related  to 
air-crew  success  resulted  from  formal  or  informal  job  analysis.  Job- 
analysis  findings  that  suggested  the  areas  covered  by  this  chapter  are 
cited  in  chapter  22. 

MEASURES  OF  MASCULINITY 

The  tests  discussed  in  this  section,  in  common  with  certain  other  in¬ 
formation  tests,  were  designed  to  reveal  some  characteristics  of  temper¬ 
ament  rather  than  characteristics  of  intellect  of  the  individuals  tested. 

The  Masculinity-Femininity  Hypothesis 

Studies  of  biographical-data  and  sports-and-hobbies  items  indicated 
that students who have acquired extensive knowledge of airplanes tend
to be more successful in pilot training. Experience in riding motorcycles
and handling guns also proved to be significantly correlated with pilot
success.  On  the  other  hand,  the  same  studies  revealed  that  items  con¬ 
cerned  with  art,  literature,  music,  and  the  like,  tended  to  yield  negative 
validities  for  the  pilot.  Speculation  concerning  these  results  suggested  the 
possibility that these differences could best be explained by the hypothesis
that more masculine men tend to succeed in flying, while those inclined
to femininity tend to fail. The possibility of determining a man's
experience and interest in these subjects by means of information items
instead of by biographical items, which are susceptible to falsification,
was appealing. The success of sports-and-hobbies information items lent
promise to this approach.

1 Written by T/Sgt. Paul C. THrU.

Additional evidence in support of the masculinity hypothesis was obtained
in observation of the performance of bomber pilots in the European
theater, which placed in bold relief the fact that the bomber pilot
needs  to  be  more  than  a  good  pilot.  Among  the  characteristics  apparently 
essential  to  the  pilot's  function  as  chief  of  the  bomber  crew  are  force- 
fulness,  aggressiveness,  and  other  attributes  allegedly  associated  with 
masculinity.  It  appeared  that  those  pilots  producing  the  best  combat  re¬ 
sults,  those  most  successfully  withstanding  combat  conditions,  and  those 
most  successful  in  handling  crews,  all  tended  to  exhibit  the  traits  and 
habit  patterns  commonly  attributed  to  masculinity. 

Since  insufficient  evidence  existed  to  establish  masculinity-femininity 
positively  as  the  determinant  of  validity  in  the  studies  cited  above,  it  ap¬ 
peared  advisable  to  test  the  hypothesis  by  the  construction  of  a  test  or 
tests  composed  of  distinctly  masculine  and  feminine  items  or  responses. 
The  validity  of  classification  of  the  items  within  this  dichotomy  would 
obviously  influence  the  results  of  such  a  study.  The  judgments  of  psy¬ 
chologically  trained  personnel  were  accepted  as  the  most  practical,  pre¬ 
liminary  index  to  masculinity  and  femininity  of  items.  Categories  of  in¬ 
formation  judged  to  be  masculine  or  feminine  by  this  method  were  listed 
and  assigned  to  the  various  forms  of  the  test,  so  that  all  items  in  a  given 
category  would  appear  in  only  one  form.  Analysis  of  results  could  then 
be  studied  by  category  if  so  desired. 

The  plan  of  research  in  the  information  area  included  the  administra¬ 
tion  of  the  test  or  tests  to  senior  students  in  high  school  for  the  purpose 
of  determining  the  typical  responses  of  the  two  sexes.  In  this  manner 
responses  could  be  designated  as  masculine  or  feminine  on  an  empirical 
basis.  Such  an  empirical  key  would  make  possible  the  desired  validation 
of  the  hypothesis  upon  which  construction  of  the  test  was  based.  Be¬ 
cause  of  the  tendency  noted  in  earlier  studies  for  the  differentiation  be¬ 
tween  masculinity  and  femininity  in  terms  of  information  to  approach 
zero  as  intelligence  increases,  it  was  planned  to  administer  a  general 
vocabulary  test  as  a  suppression  instrument  to  use  along  with  the  infor¬ 
mation  test.  At  the  time  this  account  was  written,  this  part  of  the  project 
had  not  yet  been  completed. 

General Information (M-F), CE505GX4 ²

As  previously  suggested,  this  test  was  designed  as  an  expanded  ver¬ 
sion  of  the  valid  portions  in  the  biographical-data  and  the  sports-and- 
hobbies  inventories  which  appeared  to  differentiate  on  the  basis  of 
masculinity. 

² The four forms, CE505GX4, 5, 6, and 7, were developed at Psychological Research Unit

Description. — The  information  covered  in  this  test  is  general,  in  the 
sense  that  it  is  commonly  available.  It  is  highly  selected,  however,  in  the 
sense that only information judged to be either predominantly masculine
or  predominantly  feminine  was  used.  Information  judged  to  be  equally 
common  to  both  sexes  was  not  employed. 

(1) Internal characteristics. — Form CE505GX4 contains 100 items,
50 masculine and 50 feminine, as determined by subjective evaluation.
Five categories of information judged to be masculine and four categories
judged  to  be  feminine  are  included.  Table  25.1  lists  the  categories  and 
numbers  of  items  in  each. 


Table 25.1. — Analysis of contents of General Information (M-F), CE505GX4

[Table body largely illegible in the source. Column headings: masculine items (category, number of items); feminine items (category, number of items). Legible entries: Hunting, 9; Card games, 9; Technical knowledge, 6.]

(2) Administration. — The test is largely self-administering in view of
its straightforward informational character. No special directions or
sample  problems  are  employed. 

(3) Scoring. — This test was first scored right (correct answers to
masculine items) minus wrong (correct answers to feminine items), with
all other responses scored zero.

Variations  of  the  test. — Several  forms  of  this  test,  differing  in  the 
categories  used,  were  constructed. 

(1) General information (M-F), CE505GX5. — Table 25.2 indicates
the categorization of items in this form.

Table 25.2. — Analysis of contents of General Information (M-F), CE505GX5

Masculine items                               Feminine items
Category                  Number of items     Category        Number of items
Explorers and inventors   13                  Radio           [illegible]
Smoking and drinking      12                  Literature      33
[illegible]               13                  [illegible]     [illegible]
Animals and snakes        13                  Cosmetics       [illegible]

(2)  General  information  (M-F),  CE505GX6. — Table  25.3  indicates 
categorization  of  items  in  this  form. 

Table 25.3. — Analysis of contents of General Information (M-F), CE505GX6

[Table body largely illegible in the source. Legible category names, masculine items: Boxing and wrestling, Household mechanics, Baseball, Motorcycling. Legible category names, feminine items: Movies, Art appreciation, Care of clothing, Etiquette. The numbers of items cannot be read reliably.]

(3) General information (M-F), CE505GX7. — Table 25.4 indicates
categorization of items in this form.

Table 25.4. — Analysis of contents of General Information (M-F), CE505GX7

Masculine items                               Feminine items
Category                  Number of items     Category                      Number of items
General mechanics         10                  Art appreciation              [illegible]
Horse racing              8                   Music composers and titles    [illegible]
Water sports              [illegible]         Stenography                   [illegible]
Boats                     10                  Clothes                       [illegible]
Chess                     5                   Groceries                     [illegible]
Farming and gardening     10


Statistical  results. — No  data  are  available. 

General  Information,  CE505GX3 

This  form  is  composed  of  items  from  exercise  3  of  forms  A  and  B 
of  the  Terman-Miles  Attitude-Interest  Analysis  Test.  Administration  of 
the test to several thousand subjects of various age levels over a period
of years resulted in the development by the authors of an empirical key
differentiating  typically  masculine  and  typically  feminine  responses. 

Description. — The information items contained in this test cover a
wide range of subjects and interests, to which some of the responses are
more common to males, some are more common to females, and a smaller
number appear to be approximately equally common to both sexes. A
few examples, in the form in which they were administered to Air Force
personnel, are presented. Weightings of responses are indicated. The +
indicates a masculine response, the - a feminine response, and 0 a
neutral or ambiguous response. These symbols, of course, did not appear
in the test booklet.

Peat is used for:
  + A. Fuel.
  - B. Pavement.
  0 C. Plaster.
  0 D. Road making.
  - E. Don't know.

A buffet is used for:
  0 A. Books.
  + B. Clothes.
  - C. Dishes.
  - D. Food.
  + E. Don't know.

Pi is equal to:
  - A. 0.6666.
  0 B. 0.7853.
  + C. 1.453.
  + D. 3.1416.
  - E. Don't know.

(1) Internal characteristics. — In preparing the test for aviation students,
some changes were made. Whereas the original test allowed for
omission when the examinee believed he did not know the answer, this
form provides a fifth, or E, alternative, marked “Don't know.” From the
140 items in the two forms of the Terman-Miles test, 75 items were
selected as being most appropriate for use with aviation students.

(2) Administration. — The test is largely self-administering. The time
allowed  for  answering  the  75  items  is  15  minutes. 

(3)  Scoring. — The  authors’  key  for  the  selected  items  was  adopted. 
This  key  includes  a  weight  of  +1  for  each  masculine  response,  —1  for 
each  feminine  response,  and  0  for  a  neutral  or  ambiguous  response,  all 
based  upon  the  empirical  findings  of  the  authors.  The  total  score  is  the 
algebraic  sum  of  the  item  response  weights. 
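
The scoring rule just described can be sketched in a few lines. This is a minimal illustration, not the authors' key: the names `EXAMPLE_KEY` and `mf_total_score` are hypothetical, and the weights shown are invented for the example.

```python
from typing import Dict, Tuple

# Hypothetical key fragment: (item number, alternative) -> weight.
# +1 = masculine response, -1 = feminine response, 0 = neutral or ambiguous.
# These weights are invented for illustration; they are not the authors' key.
EXAMPLE_KEY: Dict[Tuple[int, str], int] = {
    (1, "A"): +1, (1, "B"): -1, (1, "C"): 0, (1, "D"): 0, (1, "E"): -1,
    (2, "A"): 0, (2, "B"): +1, (2, "C"): -1, (2, "D"): -1, (2, "E"): +1,
}


def mf_total_score(responses: Dict[int, str],
                   key: Dict[Tuple[int, str], int]) -> int:
    """Return the algebraic sum of the weights of the marked alternatives.

    Alternatives that do not appear in the key are treated as weight 0.
    """
    return sum(key.get((item, alt), 0) for item, alt in responses.items())


if __name__ == "__main__":
    # One masculine (+1) and one feminine (-1) response cancel out.
    print(mf_total_score({1: "A", 2: "C"}, EXAMPLE_KEY))
```

Treating unkeyed alternatives as zero is an assumption of the sketch; the source states only that neutral or ambiguous responses carry a weight of 0.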

Statistical results. — Only item-analysis statistics are available, based
upon  unclassified  aviation  students  tested  in  September  1944  at  Psycho¬ 
logical  Research  Unit  No.  3. 

(1) Internal consistency. — Based upon total score, the highest and
lowest  200  (approximately  27  percent)  of  750  papers  were  removed  for 
item-analysis  purposes.  For  this  sample,  115  responses  keyed  as  mascu¬ 
line  yielded  a  mean  internal-consistency  phi  of  +0.11,  a  standard  devi¬ 
ation  of  0.09,  and  a  range  from  —0.14  to  +0.32.  Likewise,  110  re¬ 
sponses  keyed  as  feminine  yielded  a  mean  internal-consistency  phi  of 
—0.10,  a  standard  deviation  of  0.12,  and  a  range  from  —0.45  to  +0.25. 
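
The report does not state how the internal-consistency phi was computed. As a hedged illustration only, the sketch below assumes the usual fourfold-table phi coefficient, comparing how often a response was chosen in the upper and lower criterion groups of equal size. The function name and the counts in the example are hypothetical.

```python
import math


def phi_upper_lower(upper_chosen: int, lower_chosen: int, group_size: int) -> float:
    """Phi coefficient for a 2x2 table of group (upper/lower) by response.

    upper_chosen: number in the upper group who marked the response
    lower_chosen: number in the lower group who marked the response
    group_size:   number of papers in each group (assumed equal)
    """
    a = upper_chosen               # upper group, response marked
    b = group_size - upper_chosen  # upper group, response not marked
    c = lower_chosen               # lower group, response marked
    d = group_size - lower_chosen  # lower group, response not marked
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        return 0.0  # response marked by everyone or by no one
    return (a * d - b * c) / denominator


if __name__ == "__main__":
    # Invented counts: 120 of 200 upper papers and 80 of 200 lower papers.
    print(round(phi_upper_lower(120, 80, 200), 2))
```

A positive value means the response was marked more often by high scorers, which is the direction expected of a response keyed as masculine.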

Further  light  may  be  cast  on  the  applicability  of  the  key  by  examin¬ 
ing  the  overlap  of  phi  values  between  the  positively  and  negatively  scored 
responses.  Of  the  115  positively  scored  responses  for  which  phi  values 
were  computed,  11  yielded  negative  phis  with  total  score;  and  of  the 
110  negatively  scored  responses  for  which  phi  values  were  computed, 
21  or  about  one-fifth  yielded  positive  phi  values. 

Evaluation. — It  is  obvious  from  the  data,  therefore,  that  the  authors’ 
empirical keys could not be highly valid for the air-crew candidate population,
since the internal consistency is relatively low. This is due in
part,  no  doubt,  to  the  fact  that  the  original  key  was  based  on  mixed 
samples,  whereas  the  air-crew  candidates  were  of  one  sex  and  age  group 
only. A careful statistical study of responses of the air-crew candidates
and those of samples of female population of similar age and background
should make possible the development of a valid masculine key. Such a
study  should  be  followed  by  a  validation  of  item  responses  against  train¬ 
ing  and  combat  criteria. 

Reaction  Speed,  CE451AX1 

Since much earlier research in masculinity-femininity had been conducted
by civilian psychologists, it seemed advisable to test the hypothesis
by means of instruments developed by them. Terman, Miles, and
Goodenough had quite thoroughly explored the word-association area
and  had  developed  tests  reported  to  differentiate  on  the  basis  of  mas¬ 
culinity-femininity. 

The  Terman-Miles  and  Goodenough  word-association  tests  were  ad¬ 
ministered  to  aviation  students  for  validation.  Because  of  the  differences 
between  the  mixed  civilian  populations  and  the  aviation  students,  it  was 
obvious  that  empirical  keys  should  eventually  be  developed  for  air  crew. 
Pending  validation  of  the  items  of  the  test,  however,  the  authors'  keys 
were  employed  for  item-analysis  purposes. 

Description. — This  is  Goodenough’s  Speed  of  Association  Test,  a 
word-association  test  devised  for  the  specific  purpose  of  differentiating 
masculine-like  from  feminine-like  individuals. 

(1)  Internal  characteristics. — This  test  consists  of  238  items  of  the 
Goodenough test plus 12 similar items added to make a total of 250.
Each  item  consists  of  a  stimulus  word  followed  by  a  blank  space.  The 
space  is  provided  for  the  response  of  the  examinee.  The  stimulus  words 
are all in common usage. Homographs are employed, since they offer
greater  latitude  of  interpretation  for  the  examinee.  Typical  samples  of 
items  are  given. 

Fair ______
Park ______
Lead ______
Cast ______
Soa ______
Bear ______
Bust ______

(2) Administration. — Directions for this test are simple but important.
Examinees are instructed to write in the space provided the first
word or phrase suggested by a stimulus word, regardless of what it is.
The fact that no right answers exist is also stressed. Instructions demand
that a response be given to every item. The time allowed for the
test is 20 minutes, plus 2 minutes for administration. Examinees are not
allowed to go back over the items, even though they are finished before
time is called. The time limit was so set that practically nobody finished.

(3) Scoring. — In view of the fact that free responses to the stimulus
words are secured in this test, the scoring is difficult and time-consuming.
The key covers approximately 150 pages and includes just about
all possible responses to the stimulus words, listed either individually
or by class.* Each possible response or class of responses is assigned a
weight on an 11-point scale from 5F to 5M, with the midpoint designated
as A (ambiguous). Male and female examinees receive different scores
for the same response in many cases. The following is a portion of the
key for marking responses to the word stake, showing the weights
(1, 2, 3, etc.) for the sexes assigned for each response:


F 

2M   3M   All references to materials as wood, iron, etc., also stick of wood.
A    A    Synonyms as stick, post, pole, peg, stick in ground, etc.
A    A    All references as meat, food, or eating (confusion with steak).
2F   2F   Ground.
4F   2F   All references to games as horseshoes, croquet, play, etc.

Final  score  on  the  test  is  the  total  masculine  less  the  total  feminine 
score,  making  high  positive  scores  masculine. 
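
The final-score rule can be illustrated with a short sketch. It assumes the weights are written as in the key excerpt (a digit from 1 to 5 followed by M or F, or the letter A for an ambiguous response). The function names are hypothetical, and the lookup of a response in the 150-page key is not modeled.

```python
from typing import Iterable


def weight_value(code: str) -> int:
    """Convert a key weight such as '3M', '2F', or 'A' to a signed integer.

    Masculine weights count as positive, feminine weights as negative,
    and the ambiguous midpoint 'A' as zero.
    """
    code = code.strip().upper()
    if code == "A":
        return 0
    magnitude, direction = int(code[:-1]), code[-1]
    if not 1 <= magnitude <= 5 or direction not in ("M", "F"):
        raise ValueError(f"weight outside the 5F-to-5M scale: {code!r}")
    return magnitude if direction == "M" else -magnitude


def reaction_speed_score(weights: Iterable[str]) -> int:
    """Total masculine weight less total feminine weight."""
    return sum(weight_value(w) for w in weights)


if __name__ == "__main__":
    # Invented set of weights looked up for four responses.
    print(reaction_speed_score(["2M", "A", "4F", "3M"]))
```

Summing signed weights is equivalent to subtracting the feminine total from the masculine total, so a high positive result corresponds to a masculine score.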

Statistical  results. — No  statistical  data  are  available  for  this  test. 

Reaction Speed, CE451AX2 (Terman-Miles)

Terman and Miles utilized both free-association and multiple-choice
association  techniques  and  reported  results  of  the  two  methods  to  be 
similar. 

Description. — This  is  the  original  Terman-Miles  word-association 
test  with  three  stimulus-words  deleted,  on  the  ground  that  they  might  be 
objectionable  to  the  aviation-student  population. 

(1)  Internal  characteristics. — The  test  contains  117  words,  each  of 
which  is  followed  by  four  other  related  words,  lettered  A  through  D. 

(2)  Administration. — Directions  for  answering  are  very  simple.  A 
sample  item  is  presented,  and  the  examinees  are  instructed  to  look  at  all 
the  alternative  words  and  then  select  the  one  which  seems  to  go  most 
naturally  with  the  stimulus-word.  They  are  cautioned  to  answer  quickly 
and  not  think  too  long  about  any  one  word.  The  samples  given  are 
typical  of  those  appearing  in  the  test. 

flesh    A. blood.    B. color.   C. meat.    D. soft.
devil    A. dare.     B. evil.    C. hell.    D. tempt.
needle   A. compass.  B. eye.     C. pine.    D. sew.
hunt     A. find.     B. gun.     C. search.  D. shop.

(3)  Scoring. — The  total  score  on  the  test  is  the  algebraic  sum  of  the 
weights of the marked items. The positively weighted responses (+1)
are masculine, and the negatively weighted ones (-1) are feminine, so
a  high  positive  score  is  highly  masculine.  The  key  was  derived  empiri¬ 
cally  by  the  authors  as  a  result  of  administration  of  the  test  to  several 
thousand  male  and  female  subjects. 

Statistical  results.— Data  available  on  this  test  are  confined  to  item- 
analysis  results,  based  upon  the  responses  of  unclassified  aviation  stu¬ 
dents  tested  in  September  and  October  1944  at  Psychological  Research 
Unit  No.  3. 

(1)  Internal  consistency. — On  the  basis  of  total  masculine  score,  re¬ 
sponses  of  the  highest  and  lowest  27  percent  of  750  unclassified  aviation 
students  were  analyzed.  Results  of  this  analysis  are  given  in  table  25.5. 

Table 25.5. — Internal consistency data for Reaction Speed, CE451AX2, based
on responses of 750 unclassified aviation students

                                                          Range of φ
Type of response         N             Mφ      SDφ      Low      High
Masculine                [illegible]   0.10    0.14     -0.10    [illegible]
Feminine                 195           .05     .14      -.50     [illegible]
Unscored or ambiguous    [illegible]   -.01    .11      -.40     [illegible]

It is apparent from these data that there is little homogeneity even
among  responses  keyed  as  masculine.  It  is  also  interesting  to  note  that 
unscored  or  ambiguous  responses  yielded  a  slightly  negative  mean,  while 
those  scored  feminine  were  correlated  only  slightly  less  with  masculine 
score  than  were  the  masculine  items.  These  facts  strongly  suggest  the 
need  for  a  new  empirically  constructed  key  for  the  aviation-student  pop¬ 
ulation. 

Evaluation  of  Masculinity-Femininity  Tests 

Lack  of  validation  data  for  the  tests  in  this  section  leaves  the  results 
of  this  research  inconclusive  insofar  as  proof  or  disproof  of  the  hypoth¬ 
esis  is  concerned.4  The  information  approach  appears  to  demand  and 
warrant  much  more  thorough  research  in  order  to  secure  a  body  of  truly 
discriminating  items.  When  that  is  accomplished,  the  task  of  proving  the 
masculinity  hypothesis  still  remains.  One  important  drawback  to  this  ap¬ 
proach  is  the  constant  change  of  facts  and  significance  of  facts,  which 
necessitates  frequent  revision  of  information  items.  The  reaction-speed 
tests are vulnerable in this respect to a somewhat lesser degree. The fact
that the masculine and the feminine items, as determined empirically by
the  authors,  yielded  only  slightly  different  correlations  with  total  mas¬ 
culinity  score  suggests  that  lack  of  internal  consistency  is  a  major  handi¬ 
cap  to  the  use  of  this  method. 

MEASURES  OF  CAREFULNESS 

The  tests  in  this  area  were  conceived  and  designed  specifically  to  assist 
in  the  selection  of  men  likely  to  be  successful  in  navigation.  It  was  hy¬ 
pothesized  that  the  extreme  care  and  exactness  required  in  navigation 
would  be  displayed  in  the  performance  of  potentially  successful  candi¬ 
dates  on  navigational  job-sample  tests.  The  four  tests  in  this  area  were 
constructed for the purpose of determining whether carefulness per se
is  related  to  navigation  success  and  whether  a  factor  of  carefulness  could 
be  isolated  and  identified.  They  were  administered  in  January  1945  for 
purposes  of  correlational  analysis  to  354  unclassified  aviation  students 
at  Medical  and  Psychological  Examining  Unit  No.  6.  All  the  data  that 
follow  are  based  upon  this  sample. 

It  is  important  to  note  that  the  instructions  to  all  four  tests  do  not 
stress either speed or accuracy to the exclusion of the other. The instructions
state simply, “This is a test of your ability (to plot movements on
a chart; or, to plot a chart; or, to read scales) quickly and accurately.”

Directional Plotting, CE455A

Since  the  navigator's  task  includes  a  great  deal  of  careful  measuring 
and  plotting  on  maps  and  charts,  it  appeared  that  a  test  measuring  the 
speed  and  accuracy  with  which  an  examinee  can  locate  points  and  esti¬ 
mate  directions  on  a  chart  would  be  in  order.  The  technique  employed  in 
this  test  differs  in  some  respects  from  those  employed  in  other  plotting 
tests  described  in  this  section. 


Figure 25.1. — Chart used in Directional Plotting, CE455A


Description. — The  chart  pictured  in  figure  25.1  is  basic  to  this  test. 
The examinee is given the coordinates of two points on the chart, such
as +J -H and -J -H, which are marked in figure 25.1. The examinee's
task is (1) to locate the positions, but make no mark on the chart,
and (2) to determine the direction of the second position from the first
in terms of the points on the marginal diagrams. In the illustration, position
-J -H is in direction J from point +J -H. If an asterisk (*) appears
before the coordinates of the first position, the order is reversed, i.e.,
the examinee identifies the direction of the first position from the second.
Thus, if the positions in figure 25.1 were written *+J -H and -J -H,
the correct answer would be D rather than J.

The  test  booklet  is  printed  separately  and  has  the  problems  listed  in 
the  following  manner: 

      First point    Second point
4.    +I -I          +D *-K
7.    *-K +N         -N +L

The  test  contains  two  parts,  part  I  having  21  problems  and  part  II 
having  25  problems. 

(1) Administration. — Each examinee is supplied with a test booklet
in which are inserted a chart and a standard 15-place answer sheet.
Approximately 5 minutes are consumed in reading the directions and doing
the sample items. Time for part I is 8 minutes, and for part II, 7 minutes.

(2) Scoring. — Two scores are obtained, one for total number of correct
responses and one for number of errors.

Statistical results. (1) Distribution statistics. — Distribution data were
computed for right and wrong scores separately. These data are given
in table 25.6.


Table 25.6. — Distribution of scores of 354 unclassified aviation students, and
reliability coefficients for Directional Plotting, CE455A

Score       M       SD      Reliability
Right       16.4    6.6     0.76
Wrong        9.7    4.2      .56

(2) Reliability coefficients. — Utilizing the raw data from which
distribution constants were derived, preliminary estimates of reliability
of the right and wrong scores were computed by means of Kuder-Richardson
formula No. 21, and are given in table 25.6. The same sample yielded a
correlation of -0.48 between right and wrong scores.
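
The Kuder-Richardson formula No. 21 named above needs only the number of items, the mean, and the standard deviation of the scores. The following is a minimal sketch of that formula in its usual textbook form; the function name `kr21` and the item count of 46 (21 problems in part I plus 25 in part II) are assumptions made for illustration, and the result need not agree exactly with the tabled coefficient, since the original computing procedure and rounding are not described.

```python
def kr21(n_items: int, mean: float, sd: float) -> float:
    """Kuder-Richardson formula 21 reliability estimate.

    Uses only the number of items, the mean score, and the standard
    deviation; it treats all items as equally difficult.
    """
    variance = sd ** 2
    correction = n_items / (n_items - 1)
    return correction * (1 - mean * (n_items - mean) / (n_items * variance))


# Right scores: M = 16.4, SD = 6.6; wrong scores: M = 9.7, SD = 4.2.
# The item count of 46 is an assumption (21 + 25 problems).
right_estimate = kr21(46, 16.4, 6.6)
wrong_estimate = kr21(46, 9.7, 4.2)
print(round(right_estimate, 2), round(wrong_estimate, 2))
```

Because the formula assumes items of equal difficulty, it tends to give a conservative estimate, which fits the description of these values as preliminary.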

(3) Factorial composition. — The most significant loadings for the
right scores are in the visualization (0.45), numerical (0.44), space III
(0.42), spatial-relations (0.30), and psychomotor-precision (0.26)
factors. The loading in the carefulness factor is only -0.03. The
communality was found to be 0.76. Principal loadings for the wrong scores
(reflected) are in the visualization (0.56) and carefulness (0.41)
factors. The communality was 0.50. For a fuller picture of the factorial
composition of this test, see appendix B.
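
As a rough check on how loadings relate to communality, the sketch below sums the squared loadings quoted above for the right scores. It assumes the rotated factors are orthogonal, in which case the communality is the sum of squared loadings over all factors; the dictionary and function names are illustrative only. The six quoted loadings account for about 0.73 of the reported communality of 0.76, the remainder presumably lying in factors not listed in the paragraph.

```python
# Loadings of the Directional Plotting right score, as quoted in the text.
loadings = {
    "visualization": 0.45,
    "numerical": 0.44,
    "space III": 0.42,
    "spatial relations": 0.30,
    "psychomotor precision": 0.26,
    "carefulness": -0.03,
}


def variance_accounted_for(values):
    """Sum of squared loadings (the communality if every orthogonal factor is included)."""
    return sum(v * v for v in values)


partial = variance_accounted_for(loadings.values())
print(round(partial, 3))  # 0.731
```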

Complex Scale Reading, CE454A *

Description. — This  test  was  designed  to  measure  the  ability  to  read 
values  from  scales  with  speed  and  precision. 

(1)  Internal  characteristics. — The  principal  instrument  in  this  test  is 
a  chart,  a  portion  of  which  is  shown  in  figure  25.2.  The  chart  is  printed 
on  a  separate  sheet,  while  the  item  descriptions  are  printed  in  a  booklet. 
The following are typical items:

  Item    Scale values    Read value on scale

  5.      B40, D20        G

  6.      A15, F32.5      B


FIGURE 25.2. SCALES USED IN COMPLEX SCALE READING TEST, CE454A


The examinee's task is to locate on the chart the points listed under
scale values, place a straight edge across the chart touching the two
points, and read the value on the scale indicated under "Read Value on
Scale." The values are read in letters, and these letters are marked on a
15-place answer sheet. The test is divided into two parts. There are 21
items in part I and 25 items in part II.

* See footnote 5.

(2) Administration. — A straight edge, a chart, an answer sheet, and
a booklet are furnished to each examinee. Test directions and four sample
practice problems appear on the front of the booklet. The directions
require about 5 minutes. Examinees are allotted 7 minutes for working
on part I and 6 minutes for part II.

(3) Scoring. — The two scores are total number right and total number
wrong.

Statistical  results.  (1)  Distribution  statistics. — Distributions  of  right 
and  wrong  scores  are  given  in  table  25.7. 

Table 25.7. — Distribution of scores of 354 unclassified aviation students and
reliability coefficients for Complex Scale Reading, CE454A

Score       M       SD      Reliability
Right       15.6    4.8     0.58
Wrong        4.8    2.8      .48

(2) Reliability coefficients. — Preliminary estimates of reliability of
scores were made by the Kuder-Richardson formula No. 21 and are
given in table 25.7. Rights and wrongs correlated -0.43 on the same
sample.

(3) Factorial composition. — The most significant loadings for the
right scores are in the numerical (0.52), spatial-relations (0.33), and
space III (0.32) factors. The loading in the carefulness factor is only
0.05. The communality is 0.55. The principal loading for the wrong
scores (reflected) is in the carefulness (0.57) factor. The communality
is 0.37. For a fuller picture of the factorial composition, see appendix B.

Plotting Test, CE452A *

This test and the Plotting Accuracy test, described as a variation,
represent further attempts to measure functions or abilities important to
the navigator.

Description. — The directions describe this test as a measure of the
ability to plot movements on a chart quickly and accurately. The chart,
by means of which the problems are solved, is printed on a sheet separate
from the test booklet.

(1) Internal characteristics. — The test booklet contains the directions
and a list of problems which are merely descriptions of moves on the
chart. The list is divided into two parts. Part I contains problems 4
through 25, and part II, problems 26 through 50. The first three numbered
problems are samples given with the directions. A reproduction
of the chart, in reduced size, is presented in figure 25.3.

* See footnote 5.

FIGURE 25.3. CHART USED IN PLOTTING TEST, CE452A

The compass rose at the right of the chart is presented in order to
eliminate, insofar as possible, the influence of differential training. The
problem  worked  out  on  the  chart  illustrates  the  type  of  item  in  the  test 
and  the  method  of  solution  to  be  employed.  The  square  marked  “X”  is 
the  starting  point  and  is  designated  as  III-14,  the  starting  location  being 
identified  with  reference  to  the  left  and  upper  marginal  keys. 

(2) Administration. — Utilizing the orientation provided by the compass
rose, the examinee is directed to make (or, in the sample problem,
to follow) the moves listed. The moves are to be visualized but never
marked on the chart. The coordinates of the last location are the answers
to the problem. In the sample, the final location is FK, the coordinates
being read now from the bottom and right, in that order. Special
emphasis is given to the necessity of marking both coordinates as answers.
The testing time for part I is 8 minutes and for part II, 7 minutes.
Reading directions and the like required approximately 6 minutes
in addition.

(3) Scoring. — As indicated above, two letters are marked on the
15-place answer sheet for each item. Because of the method of answering,
an examinee might get one letter correct even though he ended up in the
wrong square. It would thus be theoretically possible for one to get at
least one-half the total letters correct without doing any of the problems
correctly. In order to compensate for this, the scoring formula R - W
might well be applied. In the preliminary explorations, however, rights
and wrongs were scored separately.

Statistical results. (1) Distribution statistics. — Data computed for
rights and wrongs separately appear in table 25.8.

Table 25.8. — Distribution of scores of 354 unclassified aviation students and
reliability coefficients for Plotting Test, CE452A

(2) Reliability coefficients. — The same sample yielded the preliminary
reliability estimates seen in table 25.8, as found by the
Kuder-Richardson formula No. 21. The rights and wrongs correlated -0.42.

(3) Factorial composition. — The most significant loadings for the
right scores are in the numerical (0.51), space III (0.46),
spatial-relations (0.25), carefulness (0.22), and psychomotor-precision
(0.20) factors. The communality is 0.65. Principal loading for the wrong
scores (reflected) is in the carefulness (0.59) factor. The communality
is 0.39. For a fuller picture of the factorial composition of this test,
see appendix B.

Variation of the test. (1) Plotting Accuracy Test, CE453A * — With
few exceptions, this test is similar to the plotting test just described. The
same kind of chart is employed, but three orientation compasses are
shown as compared with one in the plotting test. The points on these are
marked by letter rather than directions. (In each item the examinee is
instructed which compass to use.) The task is similar to that in the
plotting test. The factorial composition of this test is similar to that
of the plotting test (see appendix B).

Carefulness  Factor  Analysis  * 

Administration  of  the  carefulness  tests  to  a  sample  of  354  aviation 
students  revealed  some  interesting  results.  It  was  first  noted  that  large 
numbers  of  wrong  responses  were  made,  with  considerable  range  and 
variability.  Although  both  right  and  wrong  scores  had  frequently  been 
obtained  on  other  tests,  the  results  of  the  carefulness  tests  first  strongly 
suggested  (1)  the  need  for  separate  statistical  treatment  of  wrong  scores, 
and  (2)  the  distinct  possibility  that  error  scores  might  be  very  different 
functionally  from  correct-response  scores. 

In a true power test, right and wrong scores correlate -1. As the
test becomes more and more speeded, the negative correlation decreases,
and it becomes possible for rights and wrongs to differ factorially.
Superficial evidence for these four tests indicated that the scores are
quite independent.
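
The contrast between a power test and a speeded test can be illustrated with a small sketch. It assumes that in a pure power test every examinee attempts every item, so that wrongs equal the number of items minus rights; the scores, the number of items attempted, and the helper `pearson` are all hypothetical and serve only to show the arithmetic.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation of two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


N_ITEMS = 46                      # hypothetical test length
rights = [12, 20, 25, 31, 40]     # hypothetical right scores

# Power test: every item attempted, so wrongs are fully determined by rights.
wrongs_power = [N_ITEMS - r for r in rights]

# Speeded test: examinees attempt different numbers of items (hypothetical).
attempted = [30, 28, 40, 35, 46]
wrongs_speeded = [a - r for a, r in zip(attempted, rights)]

r_power = pearson(rights, wrongs_power)
r_speeded = pearson(rights, wrongs_speeded)
print(round(r_power, 2), round(r_speeded, 2))
```

Once the number attempted varies from person to person, the wrong score carries information that the right score does not, which is the condition under which the two can differ factorially.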

The data. — On the basis of the evidence and theoretical concepts
already outlined, it was determined to score, intercorrelate, and analyze
the right and wrong scores of these tests separately. In addition to these
8 variables, 11 classification tests were also included in the matrix. The
list includes several apparatus classification tests, because they showed
such high correlations with the carefulness tests. Descriptions of the
selected classification tests appear elsewhere in this report.
Intercorrelations of the variables appear in table 25.9. Intercorrelations
of the carefulness tests and correlations of the carefulness tests with
classification tests are based on a sample of 354 unclassified aviation
students. Intercorrelations of the classification tests are based upon
other comparable samples of unclassified aviation students, with a total
N of 1,920. Centroid loadings appear in table 25.10 and rotated factor
loadings in table 25.11. The axes of wrong scores were reversed in
correlating the variables, so that low error scores, when associated with
good performance in other variables, would produce positive correlations.

The factors. — Thirteen factors in all were extracted, eight of which
are more or less well identifiable as genuine factors. Of the remaining
five, four resulted from the fact that both right and wrong scores for any
given test were derived from the same sample. The effect of this was to
introduce four doublet factors, one for each carefulness test. The fifth
factor is residual. Each common factor will be discussed, along with the
principal tests with projections on the factor. No tests having loadings
below 0.25 will be listed in the following groups.

Rotated factor I is identified by the following data:

  Test                           Loading
  Complex Scale Reading (R)      .52
  Plotting (R)                   .51
  Arithmetic Reasoning           (not legible)
  Directional Plotting (R)       .44

This is easily identified as the numerical factor, which has been
isolated in other analyses. The weighted averages of the loadings of
Numerical Operations (front) (0.78) and Arithmetic Reasoning (0.48) in
several analyses make this identification positive. This is one of the two
factors in which significant loadings appear for right scores of every
one of the carefulness tests. The Plotting Accuracy and Plotting tests
both involve simple counting, in addition to other functions. The Complex
Scale Reading test involves points on numerical scales. This may be
rationalized as numerical facility in the sense that numbers must be
retained and recognized quickly. The task in the Directional Plotting test
is more difficult to identify as numerical. The best explanation is
probably that a sort of mental counting takes place in locating the
points, even though no numbers are used in the chart.

Rotated factor II is identified by the following data:

The tests appearing on this factor clearly identify it as the
psychomotor-coordination factor previously isolated. Loadings are in
general agreement with those in other analyses.

Rotated factor III is identified by the following data:

This is an entirely new factor, which appears to be uniquely
characteristic of the wrong scores of the carefulness tests, at least in
this matrix. Right scores of Plotting and Plotting Accuracy show slight,
but probably not significant, loadings (0.22 and 0.19 respectively) with
the factor. In the light of this evidence, the factor was named
carefulness. Had the error scores not been analyzed, the general
conclusion would have been that no new factor resembling carefulness could
be found in carefulness tests. To what extent this factor is common to
error scores in other tests is still to be determined.

Rotated factor IV is identified by the following data:

This variable is difficult to reconcile with findings of other analyses
as a single factor. The loading of Arithmetic Reasoning suggests the
general-reasoning factor, but the loading for this factor is larger than
the weighted average (0.47) for several other analyses. Reading
Comprehension has a much smaller average loading on the factor (0.19) than
it has in this analysis, while the weighted average loading for Mechanical
Principles (0.34) is identical with that obtained here. Weighted averages
of loadings in the verbal factor for the same three tests are 0.27, 0.60,
and 0.03, respectively. Combining mean weighted-average variances in the
general-reasoning factor with those in the verbal factor gives totals of
0.54, 0.63, and 0.34 respectively for the tests appearing significantly on
this variable in this analysis. The evidence cited seems to indicate that
the factor is a rough combination of the general-reasoning and verbal
factors which are identified separately in other analyses.
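
The combined totals quoted above are consistent with adding the squared loadings (variances) on the two factors and taking the square root. The sketch below works through that arithmetic for the three tests; it assumes the general-reasoning and verbal factors are orthogonal, and the names `general_reasoning`, `verbal`, and `combined_loading` are illustrative only.

```python
from math import sqrt

# Weighted-average loadings from other analyses, as quoted in the text.
general_reasoning = {
    "Arithmetic Reasoning": 0.47,
    "Reading Comprehension": 0.19,
    "Mechanical Principles": 0.34,
}
verbal = {
    "Arithmetic Reasoning": 0.27,
    "Reading Comprehension": 0.60,
    "Mechanical Principles": 0.03,
}


def combined_loading(a: float, b: float) -> float:
    """Loading on a single axis that absorbs the variance of two orthogonal factors."""
    return sqrt(a * a + b * b)


combined = {
    test: round(combined_loading(general_reasoning[test], verbal[test]), 2)
    for test in general_reasoning
}
print(combined)
# {'Arithmetic Reasoning': 0.54, 'Reading Comprehension': 0.63, 'Mechanical Principles': 0.34}
```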

Rotated factor V is identified by the following data:

This is apparently the same factor that was previously identified (in
the analysis of the December 1942 battery; see ch. 28) as the
psychomotor-precision factor. It is not difficult to see the content, as
described by the term psychomotor precision, in most of these tests.

Rotated factor VI is identified by the following data:

The loading of Mechanical Principles on this factor, together with the
loadings of other tests, indicates that this is the factor identified in
earlier analyses as visualization. The fact that Directional Plotting,
rights and wrongs, are both relatively high in the factor is revealing.
The task in this test involves determining direction on a chart in terms
of the points on a compass rose that is located on the margin of the
paper. The ability to visualize the compass rose on the plot itself would
obviously assist in getting a high right score. Achievement of a low
negative score is apparently even more a function of this ability. A
formula score for this test would yield a much higher loading for this
factor than has been found in any test yet analyzed.

It is probably significant that the loading in the visualization factor
was found in previous analyses to be inversely proportional to the speed
demanded in the task. Interpreting the loading of wrong scores on this
factor in the light of this evidence suggests that the low wrong scores
are made by the more deliberate but meticulous individuals who employ
a rather exact type of visualization.

The moderately low loading of Rudder Control (0.27) on this factor
may be due to the necessity of visualizing the direction and amount of
movement to be made in order to correct the imbalance. The kinesthetic
sense is undoubtedly important in this test, but it must apparently be
supplemented by visual imagery of the relative position of apparatus
and target. Failing the latter, the examinee may be engaging in continual
trial and error.

Rotated factors VII, IX, XI, and XII were anticipated, and rotations
were made first to those positions in order to facilitate the rotation of
the other reference axes. It will be noted that each of these is identified
only by the right and wrong scores of one of the carefulness tests. These
factors are best explained as resulting from (1) true nonerror specific
variance and (2) the correlation of errors in right and wrong scores
obtained from the same test and sample.

Rotated factor VIII is identified by the following data:

The identity of this factor is difficult to establish, but the combination
of tests appearing with the factor indicates that some sort of spatial
ability is the common element. Because the Two-Hand Coordination test
was highest, the factor was named space III (two-hand coordination)
until a more definite description can be achieved. A characteristic of the
factor may be tentatively hypothesized as "spatial reference." Change in
location and speed are important in Two-Hand Coordination; change of
location and possibly distance are factors in Rotary Pursuit; distance and
direction are factors in the Plotting, Directional Plotting, and Plotting
Accuracy tests; Complex Scale Reading involves direction and location; and
some problems in the Mechanical Principles test involve direction and
possible change in distance. Although the loading for the latter test may
be due to the absence of the mechanical-experience factor in this
analysis, all these tests appear to involve a spatial reference factor. It
cannot at present be identified with space II (hands), in spite of the
fact that the leading test in the list here is Two-Hand Coordination, in
which a right-left space discrimination is very apparent. This test
appeared in the same analysis with the Hands test, the Integration Battery
analysis (see ch. 10), without showing any space communality over and
above that of space I. More information should be secured before positive
identification of factor space III is made.

Rotated factor X is identified by the following data:

The factor bears considerable resemblance to the spatial-relations or
space I factor identified in other analyses. Weighted average loadings
from several analyses on the spatial-relations factor for the Complex
Coordination, Two-Hand Coordination, Discrimination Reaction Time,
and Finger Dexterity tests are 0.49, 0.41, 0.42, and 0.17, respectively.
The much higher loading of Two-Hand Coordination in this analysis is
difficult to rationalize. The loadings of carefulness tests (Plotting
Accuracy has a loading of 0.20) indicate that this factor identifies
another characteristic in which the four tests are similar. In terms of
the description of the factor, the common element appears to be the
locating and relating of fixed and spatially separated points. With the
description of this factor involvement in the carefulness tests (right
scores), we have evidence that all of them resemble the tests of Dial
Reading and Table Reading functionally, with loadings in numerical and
space I quite comparable.

Rotated  factor  XIII  appears  to  be  a  true  residual. 

Conclusions. — This analysis produced some new and useful information.
It is interesting to note that, for the first time, an entire group of
printed tests proved to have more in common with apparatus tests than
with other printed tests. This fact gives considerable encouragement to
proponents of the belief that many factors appearing in apparatus tasks
can be duplicated in printed tests. It is also noteworthy that this analysis
produced more complete identification of the apparatus tests than any
one analysis had previously produced.

Probably the most significant result of this analysis is the discovery
that analysis of wrong scores brought to light an entirely new factor. This
fact may have important implications for future factor analysis and test
construction. It may be assumed safely that, if correlations of right and
wrong scores are not too high, a fuller picture of the true functions
measured by a test can be obtained by analyzing the scores separately
than by analyzing formula scores. The results also imply that many an
error has possibly been committed by combining right and wrong scores
in the same formula. Unless the two are factorially similar, the result
may be very different than had been intended by the test maker. The
finding actually opens up a whole area of research on the use and
weighting of error scores in printed tests. In this connection, it is in
order to suggest that if right and wrong scores from a test are considered
in factor analysis or in composite predictions, they should be derived
from separate forms in order to avoid spurious correlations.

Although more investigation should be made to confirm the findings
of this analysis, the discovery of a carefulness factor is certainly
significant. Nothing is known as yet concerning the validity of the factor
for air-crew selection, but the uniqueness of the wrong scores which
identify the factor suggests that it would add much to the classification
stanines even though validity of the factor is not high. The new space
factor (space III) is also a discovery which extends to some degree the
knowledge of apparatus tests. If the factor proves valid, it will assist
further in accounting for the total validity of these tests.

Evaluation of Carefulness Tests

Most of the results of the research in this area are covered in the
concluding statements regarding the factorial study. Nothing is known,
as yet, concerning the validity of these tests for any air-crew positions.
It appears logical to assume, however, that wrong scores, at least, will
prove useful in selection for clerical-type work and possibly for
navigation.

As a characteristic of temperament, little more can be said concerning
carefulness than was said in the discussion of the rationale that
stimulated preparation of the tests. It may be added, however, that we
know that these tests identify a factor of some kind, and that the factor
is presumed to be carefulness which, as presently conceived, is a trait of
temperament. How far the trait is generalized to other test and job
situations is yet to be determined.

MEASURES  OF  FEAR  AND  TENSION 

The purpose of the tests in this section was to identify those men
who are less likely to succeed in the flying situation by reason of fear
of physical danger. Traits of temperament, such as fearfulness, occur
in varying degrees of intensity and with varying degrees of generality
in different individuals. Such traits manifest themselves in the overt
behavior, mental attitudes, and other implicit emotional reactions. Many
expressions of fear are difficult to evaluate objectively. Among those
that may be subject to objective measurement are verbally expressed
opinions.

Survey  of  Aviator  Opinion,  CE604B  10 

It appeared logical to assume that attitudes of individuals toward
danger would be reflected in their opinions regarding aviation practices,
construction of planes, methods of training, and the like. These
expressions would be valid indices of attitudes only if no irrelevant
motivation for certain responses exists. This test is an outgrowth of part
II of the original Biographical Data Blank, CE602A (see ch. 27).

One aspect of motivation contributing to invalidity of test responses
is the general social unacceptability of the exhibition of fear. Because of
this, it seemed advisable to present material in this survey in such a way
that the examinee would regard none of the available responses as
meriting social disapproval. Careful analysis of the responses should
make it possible to distribute or rank individuals according to the
proportion of responses identified as associated with or symptomatic of
fear of physical danger.

In line with these requirements, it was determined to construct a survey
which would explore the cadet's opinions about methods, policies,
equipment, and the like, employed in military aviation. An attempt was
made to describe the situations in such a manner that the examinee
would respond in the role of an emotionally uninterested observer.
Presumably this approach would minimize equivocation, since opinions so
expressed should appear to the examinee to be devoid of social
implications.

10 Developed at Psychological Research Unit No. 3. Chief contributors: Capt. S. W. Cook, Col.
J. P. Guilford, Capt. L. G. Humphreys, and Lt. David H. Jenkins.

Description. (1) Internal characteristics. — This form consists of 52
statements about flying. The student is required to indicate his attitude
toward each statement on the following 5-point scale:

A.  Strongly  agree. 

B.  Agree. 

C.  Undecided  or  have  no  opinion. 

D.  Disagree. 

E.  Strongly  disagree. 

He  enters  his  reaction  to  each  statement  opposite  the  appropriate  space 
on  the  separate  answer  sheet.  The  following  statements  are  typical  of 
those  appearing  in  the  test : 

In  building  military  planes  it  is  well  to  sacrifice  speed  in  order  to  add  more 
armor. 

A  course  of  parachute  jumping  should  be  given  every  student  flyer  before  he 
begins  flying  training. 

Acrobatics  should  be  reduced  to  a  minimum  in  training. 

The  quality  of  the  plane  is  the  biggest  factor  determining  a  pilot’s  success. 

(2) Administration. — The Survey is practically self-administering.
Brief directions on responding and marking the answer sheet suffice.
Although it was desired that all examinees respond to every statement,
a time limit of 12½ minutes was established. Students are urged to
record their first reactions, rather than those resulting from long and
careful consideration. Most examinees finish in the allotted time.

(3) Scoring. — Owing to the nature of the material involved, responses
do not fall into right and wrong categories. Either of two methods of
scoring can be utilized in such a case: an a priori key can be made, or
a key can be developed from validation of the responses to the items.
The latter method was employed. A relatively large sample to which the
Survey was administered was divided into two equal parts (odds vs.
evens). Response validities against the criterion of graduation or
elimination from primary pilot training for the two groups were calculated
separately. On the basis of these validities, keys were constructed. The
key of the odd sample was used in scoring the even sample, and the
even-sample key was used in scoring the odd sample. This was done in
order to avoid the "bootstrap" effect of scoring a sample by means of an
empirical key derived from the same sample. As a result of the information
obtained from this response validation, each response was scored
either plus 1, minus 1, or zero. The scoring formula is the algebraic sum
of the response weights. A constant of 20 was added to all scores in order
to eliminate negative scores. The key derived from sample I contains
59 responses scored plus 1 and 52 responses scored minus 1. The
key derived from sample II contains 52 responses scored plus 1 and 59
responses scored minus 1.
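
The odd-even cross-keying procedure described above can be sketched as follows. This is only an illustration under stated assumptions: the rule that turns response validities into weights (a fixed margin between the pass and fail endorsement rates) is hypothetical, since the text does not give the actual cutoff, and the eight examinees and their three-item answer strings are invented. What follows the text is the structure: each half is scored with the key built on the other half, and a constant of 20 is added.

```python
CONSTANT = 20  # added to every score to avoid negative totals, as in the text


def build_key(examinees, margin=0.10):
    """Weight each (item, response) pair +1, -1, or 0.

    The margin rule is a hypothetical stand-in for the response
    validities used in the original study.
    """
    passed = [e for e in examinees if e["passed"]]
    failed = [e for e in examinees if not e["passed"]]
    pairs = {(i, r) for e in examinees for i, r in enumerate(e["answers"])}
    key = {}
    for item, resp in pairs:
        p_pass = sum(e["answers"][item] == resp for e in passed) / len(passed)
        p_fail = sum(e["answers"][item] == resp for e in failed) / len(failed)
        diff = p_pass - p_fail
        key[(item, resp)] = 1 if diff > margin else -1 if diff < -margin else 0
    return key


def score(examinee, key):
    """Algebraic sum of response weights plus the constant."""
    return CONSTANT + sum(key.get((i, r), 0) for i, r in enumerate(examinee["answers"]))


def cross_score(examinees):
    """Score odd-numbered cases with the even-sample key and vice versa."""
    odd, even = examinees[0::2], examinees[1::2]   # 1st, 3rd, ... and 2nd, 4th, ...
    key_odd, key_even = build_key(odd), build_key(even)
    return [
        score(e, key_even if idx % 2 == 0 else key_odd)
        for idx, e in enumerate(examinees)
    ]


# Invented data: answers to three items on the A-E scale, plus pass/fail outcome.
examinees = [
    {"answers": "ABD", "passed": True},
    {"answers": "ABD", "passed": True},
    {"answers": "EDA", "passed": False},
    {"answers": "EDA", "passed": False},
    {"answers": "ABC", "passed": True},
    {"answers": "ACD", "passed": True},
    {"answers": "EDB", "passed": False},
    {"answers": "DDA", "passed": False},
]
scores = cross_score(examinees)
print(scores)
```

Responses that never occurred in the half used to build a key receive a weight of zero here, which is one simple way of handling them; the original study does not say how such cases were treated.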

Statistical results. (1) Distribution statistics. — The two samples on
which the empirical scoring keys were based yielded the distribution
constants given in table 25.12.


Table 25.12. — Distribution of scores on Survey of Aviator Opinion, CE604B,
for samples of pilot trainees ¹

Sample      N       M       SD
I           681     32.3    6.1
II          685     25.4    5.3

¹ In class 44K. Tested at Psychological Research Unit No. 3.


(2) Test validity. — The two samples previously referred to yielded
the data in table 25.13 based on a dichotomy of primary, basic, and
advanced eliminees plus those rated as "below-average" ¹¹ in primary
training versus all others in the samples who were graduated from advanced.
This  criterion  was  adopted  in  preference  to  the  usual  primary  pass-fail 
criterion  because  the  elimination  rate  was  low  at  this  time  and  only  by 
some  such  device  could  the  low  group  be  made  sufficiently  large  to  make 
validity  results  stable. 


Table 25.13. — Validity of Survey of Aviator Opinion, CE604B, for prediction of
success in pilot training

N_t       p_g     M_g      M_e      SD_t    r_bis   r_bis (corrected) ¹
683 ²     0.74    32.46    31.82    6.09    0.06    0.07
685 ²     .74     25.58    24.84    5.30    .08     .0[illegible]

¹ Assuming an unrestricted stanine standard deviation of 3.00.
² In class 44F. Tested at Psychological Research Unit No. 3.


(3)  Item  validity. — Validities  of  all  responses  were  computed  for  the 
two  samples  separated  according  to  the  dichotomy  already  described.  An¬ 
other  sample  was  validated  against  the  pass-fail  criterion  in  primary 
training. Means and standard deviations of phi values are given in table
25.14.  The  most  valid  response  of  the  five,  without  regard  to  sign,  is 
used  in  each  item  as  the  basis  on  which  these  statistics  are  given. 

Further knowledge concerning the usefulness of this instrument was
sought in a study of pilot trainee performance. Three items of the
Rating Cadet Performance SA-T2 ¹² scale were used as criteria against
which the items of the Survey of Aviator Opinion were validated. The
three items are: Item No. 5, relaxation in flight: ability to relax during
flight (freedom from tenseness); Item No. 9, social confidence: ease
with which he approaches you to ask questions and express his opinion;
Item No. 10, potential ability: promise he shows for future success as
a pilot.

¹¹ These ratings were secured from pilot proficiency cards.

¹² A rating scale used in a landing-study project. This scale is not
described in this report.

For purposes of the analyses, the results of which are given in table
25.15, the group of 140 was split into the high and low 50 percent on
each of the three criterion ratings described. The ratings were made on
an eight-point scale with the extremes described as Least . . . and Most
. . . and the center as middle of the group. Responses to the items in
the Survey of Aviator Opinion varied greatly in validity against these
criteria, and the relationships between validities for the various criteria
were not high. The means and standard deviations of the most valid
responses to items were very similar, however, as indicated by the data in
table 25.15. This similarity is probably due in large part to the
intercorrelations among the criteria.
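
The high-low split and the "most valid response" statistic can be
illustrated with a short Python sketch. The function names are
hypothetical, and the handling of tied ratings at the median (here left to
sort order) is an assumption, since the text does not say how ties on the
eight-point scale were divided.

```python
import math

def phi(a, b, c, d):
    """Phi for a 2x2 table: a, b = chose response (high, low group);
    c, d = did not choose it (high, low group)."""
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return 0.0 if denom == 0 else (a * d - b * c) / denom

def split_high_low(ratings):
    """True for examinees in the upper 50 percent on a criterion rating.
    Ties at the median fall wherever the stable sort places them."""
    order = sorted(range(len(ratings)), key=lambda i: ratings[i])
    high = set(order[len(ratings) // 2:])
    return [i in high for i in range(len(ratings))]

def most_valid_phi(choices, high):
    """Largest phi, without regard to sign, among the responses to one item.
    choices: the response letter chosen by each examinee."""
    best = 0.0
    for resp in set(choices):
        pairs = list(zip(choices, high))
        a = sum(1 for ch, h in pairs if ch == resp and h)
        b = sum(1 for ch, h in pairs if ch == resp and not h)
        c = sum(1 for ch, h in pairs if ch != resp and h)
        d = sum(1 for ch, h in pairs if ch != resp and not h)
        best = max(best, abs(phi(a, b, c, d)))
    return best
```

The mean and standard deviation of `most_valid_phi` over all items would
then correspond to the kind of summary reported for each criterion.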


Table 25.15. — Validity phis based on most valid response to items of Survey of
Aviator Opinion, CE604B, using rating criteria for pilots in primary training

                                                                   Range of phi
Criterion                                 N     M_phi         SD_phi        Low           High
Item No. 5 in Rating Cadet Performance    140   0.10          0.05          0.04          0.21
Item No. 9 in Rating Cadet Performance    140   .10           [illegible]   [illegible]   [illegible]
Item No. 10 in Rating Cadet Performance   140   [illegible]   .04           [illegible]   [illegible]

The  positive  phi  values  obtained  indicate  correlation  of  Survey  of  Avi¬ 
ator  Opinion  item  responses  with  desirable  ratings  in  the  three  traits.  It 
is  significant  that  the  ranges  of  phi  values  were  not  great.  Owing  partly 
to this fact, the rank orders of response validities based on the various
criteria  did  not  agree  closely. 

Evaluation. — As  an  instrument  for  measuring  attitudes  of  fear  and 
caution,  this  survey  may  be  useful.  The  validities  reported  do  not  seri¬ 
ously  cast  doubt  upon  the  potential  usefulness  of  the  instrument,  since 
the  criteria  employed  were  inappropriate.  Criteria  involving  actual  man¬ 
ifestations  of  fear  might  yield  quite  different  results.  It  is  likely,  how¬ 
ever,  that  reliable  diagnosis  and  prognosis  can  be  made  only  for  those 
who  vary  extremely  from  the  norm  of  such  an  instrument.  Accurate 
prediction,  even  for  these,  appears  to  depend  upon  the  extent  to  which 
the  aspect  of  the  social  unacceptability  of  the  fear  response  can  be 
masked.  To  whatever  extent  it  is  possible  to  secure  accurate  responses 
from the chronically fearful and overcautious, it is probable that the
technique  employed  in  Survey  of  Aviator  Opinion  is  useful. 

Variations  of  the  test 

(1) Survey of Aviator Opinion, CE604A ¹³. — This is the first form of
the Survey, which contains 45 items similar to those described under
CE604B. The original nucleus of 20 items appeared in an early form of
Biographical Data. Form CE604A was administered to 90 superior fighter
pilots and 47 superior heavy-bomber pilots in class 44B, and an item
analysis was made. Out of a total of 225 possible responses in the test, 193
were selected by 5 percent or more of the group. Of this number, 38
responses yielded phi values of 0.15 or greater with the fighter-bomber
dichotomy. Of the 45 items, 20 yielded phi values of this magnitude for
one or more responses.

¹³ See footnote 10.

(2) Survey of Aviator Opinion, CE604C ¹⁴. — On the basis of validation
against the Rating Cadet Performance items, a careful inspection was
made to determine what types of items were most discriminating. Certain
characteristics of opinion were common to those who were rated as lacking
in confidence. This group tended to favor more thorough instruction in
ground school and special phases of flying. They did not favor training
that is dangerous. They favored safety precautions in flying, safer planes,
and protection of cadets from off-duty danger by rules against
motorcycling and the like. In general this group felt that fear and
tenseness are unimportant and can be overcome. Consistent with this is the
belief also expressed that relaxed pilots are not necessarily good and that
slow learners should be given special help.

In  line  with  these  findings,  a  new  form  of  Survey  of  Aviator  Opinion 
was  constructed,  containing  60  items.  More  items  of  the  type  that  showed 
discrimination  in  the  B  form  were  constructed,  and  items  of  nondiscrim¬ 
inating  character  were  omitted.  This  form  was  administered  for  vali¬ 
dation  but  data  were  not  available  at  the  time  this  was  written. 

Stress Resolution, CE441A ¹⁵

It has been hypothesized that normal individuals succumb to combat
fatigue because of the abnormal stresses imposed upon them by combat
conditions. The use of the term "normal," in this connection, is inexact,
but  the  fact  remains  that  individuals  display  a  wide  range  of  reactions  to 
stress  situations,  and  no  valid  method  has  been  found  for  predicting  these 
reactions. 

The devisers of this test set forth the hypothesis that individuals may
be placed in three classes according to their reactions to stress
situations. In the first group are those who look upon the opportunities
that a stress situation has to offer as being more important than its
threat of failure or loss. This group contains the rough-and-ready
individual who is always willing to take a chance.

In the second group are those individuals who see the threat of loss
or failure as more important than the chance of success. These
individuals are conservative and avoid, whenever possible, the necessity of
taking a chance.

¹⁴ Developed at Psychological Research Unit No. 3. Chief contributor:
Cpl Harold H. KoBor.

¹⁵ Developed at Psychological Research Unit No. 1. Chief contributor:
T/Sgt Ralph H.

The third group consists of those middle-of-the-road persons who
have no strong predilection to either seek or shun the chance situation.
The  direction  of  their  motivation  fluctuates  about  a  mean,  where  pos¬ 
sible  advantage  in  the  chance  situation  is  evenly  balanced  in  their  think¬ 
ing  against  possible  disadvantage. 

The  hypothesis  held  that  the  second  group  should  be  more  susceptible 
to  combat  fatigue  by  reason  of  the  conflict  created  by  the  military  situa¬ 
tion.  The  military  mores  encourage  and  laud  the  taking  of  dangerous 
and  often  costly  risks,  while  the  bent  of  the  individual  is  toward  careful, 
conservative  conduct.  Ambivalence  results,  since  the  individual  wants  to 
do  what  is  socially  approved  but  is  emotionally  unsuited  to  such  action. 

Description. — The designers of this test attempted to describe
situations in such a manner that the examinee would have occasion to
demonstrate the extent to which his choices are influenced by
considerations of absolute security in preference to precarious opportunity.

(1) Internal characteristics. — Part I of the test consists of five
problems, each of which includes seven items. The following sample
illustrates the method of presentation and content of this part:

You  have  just  been  assigned  to  a  new  job  as  an  assistant  instructor  in  assembling 
and  disassembling  machine  guns.  You  have  seen  the  others  work  with  the  guns  but 
have  not  had  a  chance  to  actually  handle  this  type  of  gun  yourself.  You  receive  a 
phone  call  asking  for  someone  to  demonstrate  the  gun.  You  do  not  know  when  the 
regular  instructor  will  return. 

In  each  of  the  following  circumstances  if  you  would  go  ahead  and  try  to  demon¬ 
strate  the  gun  yourself,  even  though  you  know  you  are  not  prepared,  blacken  the 
space  under  A.  If  you  are  not  sure  what  you  would  do,  blacken  B.  If  you  would  try 
to  get  out  of  it  by  asking  them  to  wait  until  the  regular  instructor  returned, 
blacken  C.  The  demonstration  is  to  be  before: 

22.  A  group  of  buck  privates. 

23.  A  group  of  commissioned  officers. 

24.  A  group  of  noncommissioned  officers. 

Part  II  contains  ten  statements  of  opinions  or  principles  regarding 
luck  and  chance.  An  excerpt  from  this  part,  including  directions  for 
answering  on  a  five-point  scale,  follows: 

If you strongly agree with one of the following statements, blacken the space
beneath A. If you agree, blacken space B. If you aren't sure, blacken C. If you
disagree, blacken D. If you strongly disagree, blacken E.

In  everyday  life  situations: 

36.  A  person  who  trusts  to  luck  will  be  more  successful  than  one  who  doesn’t. 

37.  Taking  a  chance  is  a  bad  thing. 

38.  A  person  should  leave  well  enough  alone. 

39.  If  a  person  trusts  to  luck,  he  is  not  using  his  head. 

40. When a person takes a chance, he has everything to lose and nothing to
gain.

Part III consists of 12 information items. Ten of these items contain
fictitious names or information, so no right answer is possible. For
face validity, two are authentic; e.g., item 46 in the following test
excerpt:

In  the  following  examination,  if  you  believe  the  correct  answer  is  A,  blacken  the 
space  under  A.  If  you  believe  the  correct  answer  is  B,  blacken  the  space  under  B. 
If  you  are  not  sure  of  the  correct  answer,  and  do  not  wish  to  guess,  blacken  the 
space under C.

You  are  given  10  points  to  start  with.  For  each  correct  answer  you  will  receive  an 
additional  point.  For  each  incorrect  answer  you  will  lose  a  point.  If  you  are  not  sure 
and  do  not  wish  to  guess,  your  score  will  not  be  affected  because  marking  space  C 
does  not  influence  your  score. 

46.  The  experiments  of  Wilbur  and  Orville  Wright  were  carried  out  at: 

A.  Kitty  Hawk.  B.  Pelican  Bay.  C.  Not  sure. 

47.  The  term  "dihedral”  was  first  used  in  a  book  published  by: 

A. Captain H. A. Smith. B. Captain W. J. Bowles. C. Not sure.

(2) Administration. — Initial test instructions are simple and short.
As  indicated  in  the  sample  items  given,  the  method  of  marking  is  ex¬ 
plained  at  the  beginning  of  each  problem  or  section. 

(3)  Scoring. — In  the  absence  of  an  empirical  key  or  weighting  sys¬ 
tem,  subjectively  determined  weights  were  assigned  to  all  responses  on 
a  five-point  scale.  Responses  that  indicated  greatest  desire  for  security 
were  given  weights  of  1,  while  those  indicating  least  consideration  for 
security  were  assigned  weights  of  5.  Intermediate  responses  received 
weights  of  2,  3,  or  4. 
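
A subjectively weighted key of this kind might be sketched in Python as
follows. The particular weights shown are hypothetical: the text states
only that weights ran from 1 (greatest desire for security) to 5 (least
consideration for security), not which weight each response received. The
item numbers are those of the sample items quoted above.

```python
# Hypothetical weights for the three-choice problems of part I:
# A = go ahead and try, B = not sure, C = wait for the regular instructor.
PART_I = {"A": 5, "B": 3, "C": 1}

# Hypothetical weights for the five-point agreement scale of part II.
# For a statement favoring security, strong agreement is weighted 1.
AGREE_IS_SECURE = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}
# For a statement favoring chance-taking, the weights are reversed.
AGREE_IS_RISKY = {resp: 6 - w for resp, w in AGREE_IS_SECURE.items()}

def score_responses(responses, key):
    """responses: {item number: response letter};
    key: {item number: {response letter: weight}}.
    Returns the sum of the weights of the responses given."""
    return sum(key[item][resp] for item, resp in responses.items())

# Assumed keying of three of the sample items, for illustration only.
example_key = {
    22: PART_I,           # demonstration before buck privates
    36: AGREE_IS_RISKY,   # "trusts to luck will be more successful"
    37: AGREE_IS_SECURE,  # "taking a chance is a bad thing"
}
```

A separate total would be computed for each part of the test, since the
three part scores were analyzed separately.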

Results and evaluation. — This test was administered to a sample of
1,087 unclassified aviation students in July and August 1944 at
Psychological Research Unit No. 1. Separate scores for the three parts
were derived and intercorrelated. Part I and part II showed considerable
correlation, but the part I-part III and part II-part III correlations
were not significantly greater than zero. Unfortunately, data are not
available on the validity of the test for prediction of air-crew success.
An even more interesting study, the validation of prediction of
susceptibility to combat fatigue, should be done. Only if these or some
similar data are available can the degree of correctness of the original
hypothesis be determined.

Evaluation  of  Measures  of  Fear  and  Tension 

Usefulness of expressions of opinion and attitude, as employed in
these tests, appears to be limited. Results obtained from administration
of the Stress Resolution test and other similar tests led to an appreciation
of the limitations of subjectively derived scoring keys. Low correlations
with training criteria achieved by the Survey of Aviator Opinion may
indicate that responses do not have the same significance for all
examinees. The validity and reliability of interpretation of these
responses by both examinees and psychologists seem, therefore, to be
especially significant in this type of test.

If valid interpretation of responses were achieved, however, the
question of the significance of the response for air-crew success still
remains unanswered. If specific fears are predictive of failure in
training, a measure such as Survey of Aviator Opinion might be effective in
identifying those likely to fail. If, on the other hand, the generality only
of the fear response is predictive, more adequate instruments would no
doubt be required. It appears, then, that fear and tension, either general
or specific, may be related to air-crew success, but that analysis of results
obtained from the tests in this area reveals no conclusive evidence.

MEASURES  OF  CONFIDENCE 

Field  studies,  as  well  as  casual  observation  by  both  psychologists  and 
laymen,  indicate  that  important  differences  exist  in  the  attitudes  of  in¬ 
dividuals  as  they  attack  new  problems  or  tasks.  Certain  individuals  enter 
upon  such  new  experiences  with  zest  and  confidence,  while  others  dis¬ 
play  considerable  trepidation.  It  is  assumed  that  between  these  extremes 
lies  the  majority  of  individuals  who  display  less  marked  reaction  to 
coping  with  new  situations.  Although  empirical  evidence  is  not  available, 
it  appears  from  observation  that  individuals  tend  to  establish  a  uniform 
pattern  of  reaction  toward  such  new  problem  situations.  These  facts 
suggest  that  a  measure  of  the  confidence  with  which  individuals  ap¬ 
proach  tasks  might  be  useful  in  the  selection  of  trainees. 

Indices of Self-Confidence, CE427A ¹⁶

It  was  hypothesized  that  confidence  should  be  measured,  not  in  terms 
of  the  excellence  of  performance  forecast  by  the  examinee,  but  rather 
in  terms  of  the  extent  to  which  the  forecast  differs  from  actual  per¬ 
formance. 

It  was  hypothesized  that  such  a  measurement  would  bear  some  rela¬ 
tionship  to  air-crew  training.  In  general,  it  was  supposed  that  the  more 
realistic  individuals  would  prove  more  capable  in  air-crew  positions.  It 
was  felt  that  those  erring  greatly  in  their  prediction  of  performance 
would  tend  to  make  similar  errors  of  judgment  in  flying  situations, 
which  would  result,  perhaps,  in  low  efficiency,  limited  success,  or  even 
in  extreme  instances,  death  to  themselves  and  others. 

It  was  decided  that  prediction  of  psychomotor  scores  could  be  used. 
Since  the  candidates  generally  are  unfamiliar  with  the  apparatus,  the 
factors  of  experience,  learning,  etc.,  would  largely  be  avoided.  Of  fur¬ 
ther  advantage  would  be  the  fact  that  little  additional  time  and  no  addi¬ 
tional  tasks  are  required  in  obtaining  scores  by  this  method. 

Description. — As  suggested  in  the  previous  paragraphs,  this  measure 
is  not  a  test  in  the  usual  sense  but  is  rather  an  indication  of  the  attitude 
toward  and  evaluation  of  new  tasks. 

(1) Internal characteristics. — A 10-point scale for rating
performance was constructed, ranging from 9-10, very good, to 1-2, very poor.
The midpoint is average, point 5 is just below, point 6, just above
average, poor is 3-4, and good is 7-8.

¹⁶ Developed at Psychological Research Unit No. 1. Chief contributor:
Lt. Gtvfft 1.

(2) Administration. — Two rating scales of the type described were
provided each student for each of the six psychomotor tests in the
classification battery. Upon entering the testing room, the candidate was
required to indicate on one of the scales the level of expected performance
in the test he was about to take. After taking the test and without
knowledge of the score obtained, the candidate was required to indicate on
the second scale what he thought his level of performance had been.
While this second rating was being made, the candidate did not have
opportunity to refer to his first rating for purposes of comparison. He
took the test simultaneously with three other students in the same room
and may have gained some impression of his relative success. His
performance and relative status on earlier tests, insofar as he could
appreciate it, also may have had some bearing on his predictions in later
tests.

(3) Scoring. — Difference scores were obtained, based upon the
absolute magnitude of the discrepancies that existed between each estimate,
estimate 1 (pre-test) and estimate 2 (post-test), and the actual
performance.

In order to make comparisons of ratings and performance possible,
standard-score norms for the points on the rating scale were
established. Test-performance standard-score norms were also computed.
Difference scores for pre-test estimates and for post-test estimates were
transmuted into a nine-point distribution, a scale value of one indicating
little or no deviation from the estimate and nine indicating extreme
deviation. These 12 scores (2 for each of 6 psychomotor tests) were used
for validation purposes.
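
The computation of a difference score may be easier to follow in a brief
Python sketch. Several details are assumptions made for illustration and
are not given in the text: ratings and performances are converted to
standard scores within the sample itself, and the nine-point distribution
is taken to have the usual stanine percentages (4, 7, 12, 17, 20, 17, 12,
7, 4). The function names are hypothetical.

```python
from statistics import mean, pstdev

def standard_scores(values):
    """Convert raw values to standard scores within the sample."""
    m, s = mean(values), pstdev(values)
    return [0.0 if s == 0 else (v - m) / s for v in values]

def difference_scores(ratings, performances):
    """Absolute discrepancy between rated and actual performance,
    both expressed as standard scores."""
    zr, zp = standard_scores(ratings), standard_scores(performances)
    return [abs(r - p) for r, p in zip(zr, zp)]

# Cumulative percentages closing scale values 1 through 8 (assumed).
CUMULATIVE_PERCENT = [4, 11, 23, 40, 60, 77, 89, 96]

def to_nine_point(diffs):
    """Rank the difference scores and cut the ranks into nine classes:
    1 = little or no deviation, 9 = extreme deviation.
    Tied values are separated arbitrarily by sort order."""
    order = sorted(range(len(diffs)), key=lambda i: diffs[i])
    scaled = [0] * len(diffs)
    for rank, i in enumerate(order):
        percentile = 100 * (rank + 0.5) / len(diffs)
        scaled[i] = 1 + sum(percentile > cut for cut in CUMULATIVE_PERCENT)
    return scaled
```

Applied once to the pre-test ratings and once to the post-test ratings of
each of the six psychomotor tests, such a routine would yield the 12
scores mentioned above.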

Statistical results. — Statistical results for this test are confined
almost entirely to small-sample validation of the difference scores just
described. The samples were tested in February 1944 at Psychological
Research Unit No. 1.

(1) Test validity. — Both pre-test and post-test difference scores were
validated against the pass-fail criterion in primary pilot training. Of
interest also are the correlations obtained between the difference scores
and standard scores on the psychomotor tests. These data are given in
table 25.16.

Table 25.16. — Correlations of difference scores on Indices of Self-Confidence,
CE427A, with various criteria, based upon a sample of pilots

Evaluation. — Several tentative conclusions can be drawn from the
data presented in table 25.16. For this sample, it is apparent that
eliminees are on the average less accurate in both predicting and
evaluating their performances in the psychomotor tests used. This fact may
suggest that those successful in training are more realistic with respect
to their abilities. With the exception of aiming stress, the partial
correlation (holding test-score constant) between difference score and
criterion was larger for the post-test than for the pre-test ratings. This
fact suggests that the characteristic of confidence takes on more
significance as the individual becomes better oriented in the field of
performance. The realistic individual acquires a better basis for his
judgment, while the unrealistic person probably changes little as a result
of his added knowledge. The magnitude of the validity figures with
performance scores partialed out suggests that further investigation should
be made, although none of the validities are significant at the 1 percent
level and only five are significant at the 5 percent level. On the basis of
the preliminary data presented, difference scores obtained in this manner
would have added considerably to the predictive value of the composite
pilot-aptitude score.
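
For readers unfamiliar with the operation of "partialing out" a variable,
the standard first-order partial correlation formula is shown below as a
small Python function. The function name and the numerical values in the
usage note are hypothetical and are not taken from table 25.16.

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant.
    Here x would be the difference score, y the training criterion,
    and z the psychomotor test score."""
    denom = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denom == 0:
        raise ValueError("undefined when x or y correlates perfectly with z")
    return (r_xy - r_xz * r_yz) / denom
```

With made-up values, a difference score correlating 0.60 with the
criterion, where both correlate 0.50 with the test score, gives
`partial_r(0.60, 0.50, 0.50)`, or about 0.47. When the test score is
unrelated to both variables, the partial correlation equals the original
correlation.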

Self-Crediting Mental Abilities, CE429A ¹⁷

As  the  title  indicates,  this  test  was  designed  to  measure  confidence  in 
mental  rather  than  physical  abilities. 

Description. — This  test  consists  of  five  parts,  in  which  various  types 
of tasks are provided. The content of the items is utilized only as a
means  of  obtaining  confidence  scores. 

(1)  Internal  characteristics. — Part  I  of  the  test  presents  12  informa¬ 
tion  items  with  four  alternative  answers  to  each  and  three  alternative 
responses  to  indicate  the  confidence  with  which  the  examinee  answers 
each  question.  The  following  item  is  typical  of  those  in  this  part: 

A  barometer  measures: 

A. Air pressure.

B. Distance.

C. Electricity.

D. Time.

M. Certainly correct.

N. Probably correct.

O.  Doubtful. 

Part  II  contains  12  five-alternative  items  calling  for  logical  selection. 
The  following  is  a  sample  of  items  in  this  part: 

An  official  always  has : 

A. A badge.

B. Duties.

C. Rights.

D. A salary.

E. A uniform.

M. Certainly correct.

N. Probably correct.

O. Doubtful.

In  answering  the  items  in  part  II,  the  examinee  is  required  to  mark 
the  two  correct  answers. 

Part  III  contains  12  items  in  which  the  examinee  must  indicate  which 
one  of  5  alternatives  docs  not  belong  in  the  list  because  it  is  unlike  the 
others.  The  following  sample  is  typical: 

A. Democrat.

B. Methodist.

C. Republican.

D. Tory.

E. Whig.

M. Certainly correct.

N. Probably correct.

O. Doubtful.

¹⁷ Developed at Psychological Research Unit No. 1. Chief contributors:
Pfc. Vernon W. Grant, Lt. Llewellyn N. Wiley.

Part  IV  contains  12  four-alternative  analogy  items.  The  following 
item  is  typical: 

Seldom  is  to  never  as  little  is  to : 

A. Small.

B. None.

C. Large.

D. Often.

M. Certainly correct.

N. Probably correct.

O. Doubtful.

Part  V  contains  12  four-alternative  number  series  items  similar  to 
the  following: 

16 17 15 18 14 19 . . .

A. 13 21.

B. 13 23.

C. 13 20.

D. 12 20.

M. Certainly correct.

N. Probably correct.

O. Doubtful.

The  examinee’s  task  is  to  select  the  numbers  that  will  carry  on  the  series 
in  the  sequence  established  by  the  numbers  listed  in  the  problem. 

(2)  Administration. — The  examinee  is  instructed  to  answer  an  item 
and  then  indicate  the  strength  of  his  confidence  in  the  correctness  of  the 
answer by filling in space M, N, or O. Fifteen-place answer sheets are
employed  for  the  test.  The  examinee  is  informed  that  if  he  answers 
M — Certainly  correct,  he  will  receive  3  points  credit  if  the  answer  is 
correct,  but  will  be  penalized  3  points  if  it  is  wrong.  N — Probably  cor¬ 
rect  is  weighted  2,  and  O — Doubtful,  1. 

(3)  Scoring. — The  scores  actually  used  in  evaluating  results  of  the 
test  were  based  upon  the  number  of  M,  N,  and  O  ratings.  In  this  way 
it  was  intended  that  absolute  knowledge,  ability,  and  the  like  would  be 
eliminated  from  the  score. 
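
The distinction between the score announced to the examinee and the score
actually used can be made concrete with a short Python sketch. The
function names are hypothetical, and treating the N and O weights as
symmetrical penalties for wrong answers is an assumption, since the text
states the penalty explicitly only for the M rating.

```python
from collections import Counter

# Weights announced to the examinee for each confidence rating.
CONFIDENCE_WEIGHTS = {"M": 3, "N": 2, "O": 1}

def announced_score(answers):
    """answers: list of (is_correct, confidence letter) pairs.
    Credit for a right answer and an equal penalty for a wrong one
    (assumed symmetrical for all three ratings)."""
    return sum(CONFIDENCE_WEIGHTS[conf] if correct else -CONFIDENCE_WEIGHTS[conf]
               for correct, conf in answers)

def confidence_counts(answers):
    """The scores actually used: how often each rating was marked,
    regardless of whether the answer was right."""
    counts = Counter(conf for _, conf in answers)
    return {letter: counts.get(letter, 0) for letter in "MNO"}
```

Because `confidence_counts` ignores correctness altogether, the resulting
scores reflect only how the examinee distributes his expressions of
confidence.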

Results and evaluation. — Preliminary analysis of the scores revealed
that approximately 85 percent of the responses were in the M, or most
confident, category. The O, or doubtful, category was marked in only
approximately 5 percent of the responses. These data were interpreted
as indicating that the material was much too easy. The 15 percent
selecting less than the most confident rating was obviously too small a
proportion, and the effective range of confidence too limited to allow for
reliable validation of the hypothesis. If more work is done on this test,
the categories of assurance should probably be revised, adding a still
more positive statement of confidence and rewording other statements.

The list might then read as follows:

L. Unquestionably correct.

M. Almost certainly correct.

N. Probably correct.

O. Possibly correct.

Although the intervals in this scale are not equal, each indicates a
different degree of confidence. It is possible that, by the use of a scale
such as that suggested, a better distribution of expressions of confidence
would be achieved.

Quantitative Estimation, CE440A ¹⁸

It was reported that failure in flying training is frequently caused by
anxiety, as expressed in lack of confidence, indecisiveness, and in other
symptoms in this area. The habitual behavior of the individual in
making decisions should reveal some information about his tendency toward
anxiety. It was hypothesized that if the individual were given some
simple task, of the five-alternative multiple-choice variety, in which
correct answers could be estimated only, the tendency to anxiety and
insecurity would manifest itself in indecision and lack of confidence in his
answers. This might be measured by having the examinee rate his
confidence in his answers in some way.

Description. — It  was  decided  that  the  desired  measure  could  best  be 
obtained  by  providing  for  several  choices  by  the  examinee.  The  number 
of  choices  or  guesses  he  took  should,  therefore,  be  a  measure  of  his 
sureness of the right answer.

(1)  Internal  characteristics. — This  test  consists  of  three  parts.  Each 
item in the test has three numbers, so the examinee will have three
spaces  on  the  answer  sheet  and  can  make  three  guesses  if  he  desires. 

Part  I  contains  30  items  in  which  the  examinee  is  required  to  select  the 
correct  proportions  of  familiar  objects.  The  samples  in  figure  25.4  are 
typical  of  the  items  in  part  I. 

Part II consists of 30 items in which the examinee is required to
select the figure which has the largest or the smallest area from a group
of five figures of various shapes. Figure 25.5 gives items typical of
part II.

Part III contains 15 items in which the examinee is required to select
the answer that correctly describes the size, weight, capacity, or the like,
of familiar objects. Following are samples of the items in this part:

190-191-192. — The number of regulation baseballs which would weigh 5 lbs. is:

(1) 16 (2) 12 (3) 8 (4) 10 (5) 20.

193-194-195. — The weight of two empty Coca-Cola bottles is:

(1) 28 oz. (2) 32 oz. (3) 24 oz. (4) 20 oz. (5) 36 oz.

202-203-204. — The maximum number of nickels which can be placed flat upon the
surface of a dollar bill is:

(1) 21 (2) 32 (3) 28 (4) 18 (5) 24.

184-185-186. — The maximum number of passengers which a standard railroad
coach is built to seat is:

(1) 70 (2) 80 (3) 90 (4) 50 (5) 60.

[Figure 25.4. Sample items of part I of Quantitative Estimation, CE440A.
Items 1-2-3: Life magazine. Items 4-5-6: label illegible.]

[Figure 25.5. Sample items of part II of Quantitative Estimation, CE440A.
Items 91-92-93: The figure which has the LARGEST area is]

¹⁸ Developed at Psychological Research Unit No. 1. Chief contributor:
T/Sgt Louis Delmaa.

(2)  Administration. — According  to  the  directions  in  the  booklet, 
parts  I,  II,  and  III  were  to  be  timed  separately,  and  examinees  were 
instructed  not  to  proceed  to  another  part  until  the  signal  was  given.  In 
actual  administration,  however,  these  directions  were  disregarded  and 
an  over-all  time  limit  of  35  minutes  imposed. 

(3) Scoring.—The examinee was instructed that if he gave one answer
only and it was correct, he would receive 6 points; if he gave two
answers and one was correct, he would receive 4 points; and if he gave
three answers and one was correct, he would receive 2 points. All
incorrect answers were to count zero. For validation, however, the
numbers of single, double, and triple responses were the only scores used.
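
The scoring rule just described can be put in a few lines of code. What follows is only a sketch under stated assumptions: the function names, the representation of a response as a set of marked answer numbers, and the sample answer sheet with its keyed answers are all hypothetical and do not come from the test booklet.

```python
def score_item(marked, correct):
    """Points for one item under the rule announced to the examinee.

    marked: set of answer numbers the examinee marked (one to three).
    correct: the keyed answer number.
    """
    points_by_count = {1: 6, 2: 4, 3: 2}
    if correct in marked and len(marked) in points_by_count:
        return points_by_count[len(marked)]
    return 0  # incorrect (or unscorable) responses count zero


def response_counts(responses):
    """Numbers of single, double, and triple responses (the validation scores)."""
    counts = {1: 0, 2: 0, 3: 0}
    for marked in responses:
        if len(marked) in counts:
            counts[len(marked)] += 1
    return counts


# Hypothetical answer sheet for four items: (marked answers, keyed answer).
sheet = [({3}, 3), ({1, 4}, 4), ({1, 2, 5}, 3), ({2}, 5)]
total = sum(score_item(marked, key) for marked, key in sheet)
counts = response_counts(marked for marked, _ in sheet)
print(total, counts)
```

The second function reflects the fact that, for validation, only the counts of single, double, and triple responses were used, so the keyed answers play no part in it.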

Statistical results.—Preliminary statistics only were obtained
concerning the proportions making one, two, and three responses and the
interrelationships of these data. From the answer sheets of a reportedly
large number of preflight individuals tested at Psychological Research
Unit No. 1 (classification not identified), 200 were selected at random
for analysis. Based on the total number who tried an item, on the
average 62 percent gave one answer, 33 percent gave two, and 5 percent
gave three. These percentages were computed from data on the first 50
items of the test. Because the testing time was short, the number of
individuals answering items decreased considerably after item 50. For
this reason it was determined that only the first 50 items should be
used in scoring the test for validation.

(1)  Test  validity. — Validation  results  based  on  one  sample  are  given 
in  table  25.17. 


Table 25.17.—Validity of Quantitative Estimation, CE440A, based upon a sample
of pilots in primary training,1 with the graduation-elimination criterion

[N,=556, P'=0.(9)

1 Tested in June 1944 at Psychological Research Unit No. 1.
2 Assuming an unrestricted stanine standard deviation of 2.00.


Evaluation.—As evidenced by the validities of various scores for
pilot success, this test did not justify the expectations set forth in the
hypothesis. It is possible, however, that an instrument of this type might
predict success in tasks affected more directly by confidence. On the
other hand, examination of parts I and II strongly suggests that these
parts measure some perceptual function rather than confidence.
Perception and memory may well be predominant also in part III.

Behavior Preference Questionnaire, CE432A

This  questionnaire  is  an  attempt  to  isolate  one  aspect  of  personality, 
namely,  degree  of  self-confidence  in  social  situations. 

Description. — This  test  is  of  the  multiple-choice,  personal-inventory 
type. 

(1) Internal characteristics.—The questionnaire consists of 40 items,
each briefly describing a social-situation problem and presenting four
alternate methods of solution. Different degrees of self-confidence
presumably are indicated by the various choices. For example:

You  are  up  before  a  military  board  for  an  interview,  and  the  head  of  the  board 
mispronounces  your  name.  What  would  you  do? 

A.  Wait  until  he  finishes  speaking,  then  correct  him. 

B.  Correct  him  at  once  politely. 

C. Say nothing, since it is probably not important.

D.  Wait  until  the  end  of  the  interview,  then  tell  him. 

You  are  waiting  in  line  to  buy  a  theater  ticket  and  a  man  pushes  his  way  in  just 
ahead  of  you.  What  would  you  do? 

A.  Give  him  a  push  out  of  line. 

B.  Tell  him  to  go  to  the  end  of  the  line. 

C.  Comment  to  those  near  you  in  line  about  the  gall  of  certain  people. 

D.  Do  and  say  nothing. 

(2) Administration.—The questionnaire is administered as a group
test with a time limit of 15 minutes. The test is paced by the
administrator by announcement of the time at the conclusion of 5, 10, and
13 minutes, in an attempt to assure completion of a maximum number of
items within the time limits of the test.

The directions specify that the examinee indicate how he would
actually handle each of the situations described in the test. He is told
that there are no right or wrong answers. If he does not definitely prefer
any of the alternatives given him, he is required to select the one which
comes closest to describing what he would do.

(3) Scoring.—In order to derive an a priori score of self-confidence,
the alternatives in each item were scaled by several "expert" raters in
terms of the degree of self-confidence revealed. Nearly all of the 40
items have 4 alternatives, which were ranked by raters with a relatively
high degree of consistency. The test is scored for only the first 30 of
the 40 items, since many of the aviation students were unable to
complete the test in the time allowed. The least confident answer to each
item, as rated by judges, is scored 1 point, while the most confident
answer is scored 4, and an intermediate answer either 2 or 3. Thus the
possible range of scores is from a minimum of 30 to a maximum of 120.

Developed at Psychological Research Unit No. 1. Chief contributor: Lt. Llewellyn N. Wiley.
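
The a priori scoring of this questionnaire amounts to summing rater-assigned weights over the first 30 items. The sketch below assumes that every scored item has four alternatives and uses one invented set of weights for all items; the actual weights assigned by the raters are not given in this account.

```python
SCORED_ITEMS = 30  # only the first 30 of the 40 items are scored


def confidence_score(choices, weights):
    """Sum the weights (1 to 4) of the chosen alternatives.

    choices: chosen letters, one per item, in item order.
    weights: one dict per item mapping letter -> rated weight.
    """
    total = 0
    for choice, item_weights in list(zip(choices, weights))[:SCORED_ITEMS]:
        total += item_weights[choice]
    return total


# Hypothetical key: identical weights for every item, for illustration only.
example_weights = [{"A": 3, "B": 4, "C": 1, "D": 2}] * 40

least = confidence_score(["C"] * 40, example_weights)
most = confidence_score(["B"] * 40, example_weights)
print(least, most)  # 30 120
```

The two extreme answer patterns reproduce the stated limits of the score range, 30 and 120.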

Statistical results.—Data are available for approximately 600
aviation students tested at Psychological Research Unit No. 1 during the
period May 31 to June 25, 1943.

(1) Distribution statistics.—The range of scores is from 40 to 95
with a median of 69.

(2) Reliability coefficient.—The odd-even reliability coefficient
obtained with a sample of 205 was 0.22, corrected. Because of the very
low reliability of the test, no further analysis of individual scores was
made; nor were clinical predictions or ratings of confidence made on
the basis of total scores.
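
An odd-even coefficient of this kind can be sketched as follows. The sketch assumes that "corrected" refers to the Spearman-Brown step-up from half length to full length, which is the usual meaning but is not stated explicitly here; the item scores are invented for illustration.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to full-test length."""
    return 2 * r_half / (1 + r_half)


def odd_even_reliability(item_scores):
    """item_scores: one list of item scores per examinee, in item order."""
    odd = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd, even)
    return r_half, spearman_brown(r_half)


# Hypothetical scores of five examinees on four items (weights 1 to 4).
scores = [[1, 2, 1, 2], [2, 2, 3, 3], [3, 4, 3, 3], [4, 3, 4, 4], [2, 1, 2, 2]]
r_half, r_corrected = odd_even_reliability(scores)
print(round(r_half, 4), round(r_corrected, 4))
```

The corrected value is always larger in absolute size than the half-test correlation, so a corrected coefficient as low as 0.22 implies a still lower agreement between the two halves.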

(3) Item validity.—An item-validation study was made, based on
pilots in elementary training. Tetrachoric correlations and levels of
significance were computed. The results showed that even the most
discriminating response had a tetrachoric r significant only at the
2 percent level. In general, the results approximate what might be
expected on a purely chance basis. Inspection of the most discriminating
items suggested no adequate rationale for their significance.
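
For a single response, the tetrachoric r relates choosing that response to graduation or elimination in a fourfold table. The report does not say how the coefficients were computed. The sketch below substitutes the cosine-pi approximation, a common shortcut and not necessarily the method used, and the cell frequencies are hypothetical.

```python
import math


def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric r for a fourfold table.

    a: chose the response, graduated     b: chose it, eliminated
    c: did not choose it, graduated      d: did not choose it, eliminated
    """
    if a * d == 0 and b * c == 0:
        raise ValueError("coefficient is undefined for this table")
    if b * c == 0:
        return 1.0
    if a * d == 0:
        return -1.0
    return math.cos(math.pi / (1 + math.sqrt((a * d) / (b * c))))


# Hypothetical frequencies for one response.
r_tet = tetrachoric_cos_pi(40, 20, 60, 80)
print(round(r_tet, 2))
```

A table with equal cross products gives a coefficient of zero, and interchanging the graduated and eliminated columns reverses the sign.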

Evaluation. — On  the  basis  of  a  very  low  reliability  (0.22  corrected) 
and  purely  chance  item-validity  data,  it  would  seem  that  this  behavior  or 
preference  questionnaire  is  of  little  promise  in  predicting  air-crew  success 
or  success  in  any  other  type  of  endeavor. 

A study of response frequencies reveals that for most of the items a
large percentage of the examinees selected one alternate in preference to
the remaining three, probably because of the strong social approbation
connected with that choice. This failure of the items to yield good
distributions of responses is ascribed to faults in the wording of the
alternates, which left too obvious differences in terms of social
desirability or undesirability.

Evaluation  of  Measures  of  Confidence 

The  evidence  presented  regarding  the  validity  of  measures  employed 
in this area is almost entirely inconclusive. This trait, if it be general
and  consistent  in  individuals,  has  proved  extremely  difficult  to  quantify. 
In  this  respect  the  trait  resembles  other  traits  of  temperament,  many  of 
which  have  thus  far  evaded  measurement. 

In the light of the results of these tests, it appears that extensive
exploration should be made in an effort to find some reliable medium
or media for measuring confidence and to determine the amount of
communality that exists among tests designed to measure it. The
self-ratings of performance on apparatus tests, which probably yielded the
purest measures of confidence used, exhibited only moderate correlations
with primary pilot graduation-elimination. It may be, of course, that
self-confidence is not significantly correlated with pilot success. It
seems more likely, however, that a reliable measure of confidence has not
yet been discovered.

MEASURES OF SOCIAL INTELLIGENCE AND LEADERSHIP


Analysis of the jobs of fighter and bomber pilots was made to
determine the basis or bases upon which a more valid method of assignment
could be devised. This analysis indicated that the responsibility of the
bomber pilot for other personnel of the crew and the interaction of
personalities resulting from the closeness of contacts among crew members
demand that the bomber pilot possess high leadership ability. The fighter
pilot, on the other hand, has little contact with others while in action,
during which he experiences his greatest stress. His social relationships
appear, therefore, to be much less significant to the task than are those
of the bomber pilot. These findings suggested that some measure of
social aptness and leadership ability would assist in singling out the
pilots likely to be successful as bomber pilots.


Social Manipulation Inventory, CE443A

It appeared that most of the requirements for such a measure would
be met by a social-intelligence test in which problem situations are
described and the examinee is required to indicate the best solution.
Solutions of the problem situations should involve some understanding of
human motivation and of individual differences, with emphasis upon such
techniques as the use of praise and blame, delegation of authority,
detection and removal of frictions within groups, and the like. If the
alternative solutions appear equally plausible and socially acceptable, a
good indication should be obtained of what the examinee would do in similar
real-life situations.

Description.—This is a purely verbal test similar to the usual
judgment test. It is not an inventory of the questionnaire type.

(1) Internal characteristics.—The inventory consists of 50 items, each
of which depicts a problem situation which might confront an officer or
other person having authority over others. In each item there are five
alternative courses of action presented. These alternatives were selected
as the most appropriate from a list of responses given by unclassified
aviation students in free-response interviews. Typical items follow. The
responses preceded by an asterisk received a +1 score; all other
responses were scored -1.

You  are  a  supervisor  of  an  office  force  of  10  people.  One  member  is  habitually 
late.  You  would: 

A.  Make  an  example  of  him  by  discharging  him. 

B.  Bawl  him  out  in  front  of  the  whole  group. 

*C. Call him in and try to find out the reason for the tardiness.

D. Call a meeting of the office force to explain that everyone owes it to the
company to be on time.

E. Call him in privately for a lecture on the importance of being on time.

When minor punishment was necessary for your crew, you would:

*A. Hand out the punishment yourself and bring no public attention to it.

B. Have the crew decide the punishment for its members.

C. Turn the matter over to Wing Headquarters to handle.

D. Use sarcasm instead of material punishment.

E. Establish fixed punishments for the usual infractions so that punishment
would be automatic.

Developed at Psychological Research Unit No. 3. Chief contributor: S/Sgt. ...

(2) Administration.—The directions to the test emphasize the fact
that there are no right answers, but that each person is to respond
exactly as he thinks he would under the circumstances described. Although
not all possible courses of action are listed, the examinee is required to
respond to every item, even if he finds difficulty in deciding which
alternative is best.

(3) Scoring.—Owing to the nature of the material covered in this
inventory, the key was of necessity determined subjectively. After
preparation of the form, 12 aviation psychologists and psychological
assistants were asked to indicate their judgment as to which alternatives
were appropriate and showed the best type of leadership ability. In key A,
36 of the 50 questions were scored. In 7 of these, 2 alternatives were
scored as desirable or +1, making a total of 43 desirable responses in the
test. The criterion for scoring a response was that 90 percent or more of
the judges agree independently as to the desirable and undesirable
responses. Almost complete agreement on a large proportion of the items led
to the decision not to score the remaining 14 items, pending item-analysis
evidence concerning their correlation with the total score on the 36. The
formula R - W/4 + 20 was used in scoring with key A, R indicating the
number of responses receiving positive weight and W the number receiving
negative weight.

Analysis of the results of the first administration revealed that 10 of
the original 36 items had low internal-consistency phi values with total
score. A new key (B) was therefore made, scoring 26 of the 36 items
scored in key A. The original papers were rescored on this key for rights
only.

Item analysis of the original sample scored with key B indicated that
some of the 14 originally unscored items were highly correlated with the
score obtained with key B. Seven of these items were, therefore, added,
and key C was made. The formula R - W/4 + 20 was used with key C.
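
The formula R - W/4 + 20 can be illustrated with a short sketch. The treatment of unkeyed and omitted items below (they contribute nothing) is an assumption, and the three-item key and the responses are hypothetical; the real keys A, B, and C are not reproduced in this account.

```python
def formula_score(responses, key, constant=20, penalty_divisor=4):
    """Score = R - W/4 + 20 over the items that the key scores.

    responses: dict item number -> chosen letter (omitted items absent).
    key: dict item number -> set of letters keyed as desirable (+1);
         every other alternative of a keyed item counts as wrong.
    """
    rights = sum(1 for item, choice in responses.items()
                 if item in key and choice in key[item])
    wrongs = sum(1 for item, choice in responses.items()
                 if item in key and choice not in key[item])
    return rights - wrongs / penalty_divisor + constant


# Hypothetical key; item 3 has two desirable alternatives, item 4 is unscored.
example_key = {1: {"C"}, 2: {"A"}, 3: {"B", "D"}}
example_responses = {1: "C", 2: "B", 3: "D", 4: "A"}
score = formula_score(example_responses, example_key)
print(score)  # 21.75
```

Scoring for rights only, as was done with key B, corresponds to the count of rights in this function without the penalty and the constant.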

Statistical results.—Data are limited to distribution constants, estimates
of reliability, and a few correlations for samples of pilots in classes 44H
and 44I, who were tested in basic training by personnel of Psychological
Research Unit No. 3.

(1) Distribution statistics.—Typical examples of distribution statistics
obtained on this test are given in table 25.18. The distribution curves are
approximately symmetrical and normal.

Table 25.18.—Distribution constants for Social Manipulation Inventory, CE443A,
based upon two groups of classified pilots

Key    N      M       SD     Scoring formula
B      750    10.2    1.2    Rights only
C      750    15.0    4.6    R - W/4 + 20

(2)  Internal  consistency. — Analysis  of  responses  based  on  various 
keys  yielded  the  internal-consistency  data  given  in  table  25.19.  The  data 
were  all  based  on  the  highest  27  percent  and  lowest  27  percent  of  groups 
of  750  classified  pilots. 


Table 25.19.—Internal-consistency data for Social Manipulation Inventory, CE443A

       Number of                               Range of phi
Key    scored items    Sample    Md     SD     Low     High
A      50              I         0.20   0.09   0.02    0.17
A      36              I          .23    .07    .10     .17
B      26              I          .33    .05    .14     .41
C      33              I          .n     .06    .12     .1*
C      33              II         .27    .06    .09     .42
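
An internal-consistency phi based on the highest and lowest 27 percent of a group can be sketched as follows. The exact computing routine used for table 25.19 is not described beyond the choice of extreme groups, so the function names, the handling of ties, and the sample data here are assumptions made for illustration.

```python
import math


def extreme_groups(total_scores, fraction=0.27):
    """Indices of the highest and lowest `fraction` of examinees by total score."""
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    n = max(1, round(fraction * len(total_scores)))
    return order[-n:], order[:n]


def phi_coefficient(a, b, c, d):
    """Phi for the fourfold table with rows (a, b) and (c, d)."""
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else 0.0


def item_phi(item_keyed, total_scores, fraction=0.27):
    """Phi between giving the keyed response (1/0) and high/low group membership."""
    high, low = extreme_groups(total_scores, fraction)
    a = sum(item_keyed[i] for i in high)  # high group, keyed response
    b = len(high) - a                     # high group, other response
    c = sum(item_keyed[i] for i in low)   # low group, keyed response
    d = len(low) - c                      # low group, other response
    return phi_coefficient(a, b, c, d)


# Hypothetical data: ten examinees, one item.
totals = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
keyed = [0, 0, 1, 0, 1, 0, 1, 1, 1, 1]
phi = item_phi(keyed, totals)
print(round(phi, 3))
```

An item whose keyed response is given equally often by the two extreme groups yields a phi of zero, which is the sense in which low values in the table mark items inconsistent with the total score.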

(3) Reliability coefficient.—Two samples yielded the estimates of
reliability given in table 25.20.

Table 25.20.—Estimated reliability coefficients (odd-even) for Social Manipulation
Inventory, CE443A

Group                N      Key    Half-test r    Corrected r
Classified pilots    750    B      0.25           0.40
Classified pilots    750    C       .34            .51
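
The two coefficients reported for key B are related as the Spearman-Brown formula would predict, which suggests, though the table does not say so, that the second figure is the first stepped up to full length. The sketch below checks this and also solves the formula backward for a corrected value of .51; the function names are illustrative only.

```python
def spearman_brown(r_half, factor=2):
    """Reliability of a test lengthened `factor` times, from the shorter test's r."""
    return factor * r_half / (1 + (factor - 1) * r_half)


def implied_half_test_r(r_corrected):
    """Half-test correlation that would step up to `r_corrected` at double length."""
    return r_corrected / (2 - r_corrected)


key_b_corrected = spearman_brown(0.25)
key_c_half = implied_half_test_r(0.51)
print(round(key_b_corrected, 2))  # 0.4
print(round(key_c_half, 2))       # 0.34
```

On this reading a half-test correlation of 0.25 gives 0.40, and a corrected value of .51 corresponds to a half-test correlation of about .34.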

Evaluation.—Although this inventory was not validated, certain
pertinent information was obtained in the form of correlations with other
measures. The correlation, corrected for attenuation of both variables,
with a measure of Reading Comprehension was 0.32, based on an N of
551. The same sample yielded a correlation of 0.50 (corrected for
attenuation) with the composite navigator aptitude score. A sample of 556
cases yielded a correlation of 0.15 (corrected for attenuation) with the
composite pilot aptitude score.
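
The correction for attenuation divides an observed correlation by the square root of the product of the two reliabilities. The sketch below shows the arithmetic only. The observed correlation and both reliabilities are hypothetical, since the uncorrected correlations and the reliabilities actually used are not given here.

```python
import math


def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Estimated correlation between true scores: r_xy / sqrt(r_xx * r_yy)."""
    if not (0 < r_xx <= 1 and 0 < r_yy <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(r_xx * r_yy)


# Hypothetical values, not taken from the report.
observed_r = 0.20
reliability_inventory = 0.50
reliability_other_test = 0.75
corrected = correct_for_attenuation(observed_r, reliability_inventory,
                                    reliability_other_test)
print(round(corrected, 2))  # 0.33
```

Because the divisor shrinks as reliability falls, a test of low reliability receives a large upward correction, and corrected coefficients for such a test should be read with corresponding caution.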

This evidence seemed to indicate a substantial positive relationship
between academic intelligence and the characteristics measured by this
inventory. If the hypothesis upon which the measure was based be true,
the evidence might indicate that those pilots who have aptitudes most like
those of navigators make better bomber pilots. It is possible, on the
other hand, that the evidence means that the test measures verbal and
other intellectual abilities, but that it does not necessarily imply
corresponding intellectual content in the task of the successful bomber
pilot. Its reliability is so low that it would probably be useful only in
conjunction with other tests in a battery.

Pilot  Behavior  Blank,  CE444A  11 

As a result of the investigation of pilot specialization, referred to in
the report on the Social Manipulation Inventory, it appeared wise to
explore the problem of the character or types of leadership. The Lewinian
division of leadership into three types (laissez-faire, authoritarian, and
democratic) was considered to be a good basis upon which to begin the
construction of an instrument for determining the presence and quality
of leadership ability. Presumably, such an instrument should be useful
in assembling crews of similar tastes with respect to social interaction. A
further assumption was that those showing preference for democratic
rather than laissez-faire or authoritarian types of leadership should be
more successful as bomber pilots where considerable social interaction
takes place. By the same token, the type of social interaction preferred
by the fighter pilot would be rather unimportant, since he has no crew
of his own and exerts little authority over others.

In the light of these assumptions, a preference blank was prepared in
which a conscious effort was made to introduce in approximately equal
numbers laissez-faire, authoritarian, and democratic solutions to
leadership situations without identifying them for the examinee or
prejudicing his choice. After a large number of items had been prepared,
10 aviation psychologists and psychological assistants keyed all the
choices, indicating which of the three types of leadership each choice
indicated (L = laissez-faire, A = authoritarian, D = democratic).

Description.—This blank consists of 90 two-alternative items. The
alternatives describe pilots with characteristically different ways of
handling situations involving leadership, authority, and the like. The
examinee is required to indicate which pilot he prefers. The following
samples are typical:

19. A. The pilot who lets the crew members make their own arrangements for
quarters, mess, and entertainment.

B. The pilot who talks about the crew's good points with others.

3A A. The pilot who gives many instructions.

B. The pilot who, while he works more energetically than the rest of the
crew, doesn't expect them to work as hard as he.

Ml A. The pilot who is so engrossed in his own duties that he hasn't the time
to try to understand the difficulties of others.

B. The pilot who gives his crew exact information for doing a job or
carrying out an order.

(1) Administration.—Directions on the front page of the blank are
intended primarily to prepare the examinee for the material to follow.
Great stress is laid upon the need for knowing what crew members are
like, so that congenial individuals can be placed together. "Clash of
personalities" is cited as a major cause of incompatibility among members
of a group. Examinees are directed to consider the pilots described as
always equally capable of flying. The examinee is therefore required to
select one of the two alternatives purely on the basis of his own
personal preference.

11 Developed at Psychological Research Unit No. 3. Chief contributor: S/Sgt. Benjamin ...


(2) Scoring.—The original plan, to score items according to the three
Lewinian categories, was discarded in favor of a system more in line
with the purpose of the blank, which was to select men better fitted for
assignment as bomber pilots. To accomplish this, the aviation
psychologists and psychological assistants examined the alternatives of all
items carefully and indicated independently which alternative for each more
closely described the good crew commander. Almost unanimous agreement
was reached in many of these evaluations. As a result of this study,
the choices were keyed for bomber pilot vs. nonbomber pilot, the bomber
choices being keyed plus 1 and others minus 1, for 40 selected items
out of the total of 90. Sufficient agreement could not be achieved on the
remaining 50 items to warrant their being scored. It is interesting to note
that the choices selected as good indicators for bomber pilots were in
almost every case those designated as "democratic" in the first survey.
This fact tends to be justified by the findings of Lewin in regard to the
success of different types of leadership. The score is the algebraic sum
of the weights of the scored responses.
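
The algebraic-sum score can be sketched as below. The key shown is invented for illustration, and the treatment of omitted items (they add nothing to the sum) is an assumption, since the handling of omissions is not described.

```python
def blank_score(choices, key):
    """Algebraic sum of +1 / -1 weights over the scored items.

    choices: dict item number -> "A" or "B" (omitted items are absent).
    key: dict item number -> the alternative keyed for the bomber pilot.
    """
    score = 0
    for item, keyed in key.items():
        if item not in choices:
            continue  # assumption: an omitted item adds nothing
        score += 1 if choices[item] == keyed else -1
    return score


# Hypothetical key for 40 scored items, numbered 1 to 40 for convenience.
example_key = {i: ("A" if i % 2 else "B") for i in range(1, 41)}

# An examinee who gives the keyed choice on 32 items and the other on 8.
example_choices = {
    i: (example_key[i] if i <= 32 else ("B" if example_key[i] == "A" else "A"))
    for i in range(1, 41)
}
print(blank_score(example_choices, example_key))  # 24
```

With 40 scored items the score can range from -40 to +40, and a score of 24 corresponds to 32 keyed choices out of 40.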

Statistical results.—Statistics on this test are confined to internally
derived data, based upon pilots in classes 44H and 44I, tested during
basic training by personnel of Psychological Research Unit No. 3.

(1)  Distribution  statistics. — A  sample  of  750  classified  pilots  yielded 
a  mean  score  of  24.3  and  a  standard  deviation  of  9.7,  using  key  A. 

(2)  Internal  consistency. — Analysis  of  responses  of  several  sample 
groups  yielded  the  internal-consistency  data  given  in  table  25.21. 


Table 25.21.—Internal-consistency data for items of Pilot Behavior Blank, CE444A,
based upon samples of classified pilots

Key A was the first subjectively derived key. After analysis of results
of the application of this key, it appeared that some choices should be
dropped and others added to secure the most internally consistent test.
The key was revised in the light of this analysis, and the original
sample was rescored (key B). A second sample scored with key B yielded
similar results.

(3) Reliability coefficient.—Reliability of this instrument was
estimated by the split-half method. The 40 scored items (key B) were
divided into 2 groups of 20 each. The items in the two groups were
equated for content (laissez-faire, authoritarian, or democratic choices)
and for "difficulty." An estimated reliability coefficient of 0.68,
corrected for length, was obtained. This figure is based on a sample of
747 classified pilots.

(4) Difficulty.—Since there are no right answers to the items in this
blank, a difficulty level, strictly speaking, cannot be obtained. The mean
proportion of preferred (bomber pilot or democratic) responses is
meaningful, however. This figure for the 40 items in key B is 0.58, the
range being from 0.23 to 0.87. The mean has not been corrected for chance,
since guessing would not be expected to enter a test of this sort where
no answers are correct.

Evaluation.—Although validation of the appropriateness of
specialization assignment from the standpoint of temperament is very
difficult, if not impossible, to accomplish, the psychological values
involved indicate that this should be one of the most useful instruments
for the selection of persons to exercise authority. It is certain that the
underlying democratic principle which is common to the positively scored
choices is psychologically sound. One possible weakness of this instrument
is that individuals may answer according to known, socially acceptable
standards, rather than according to their real bent. Attempt was made to
eliminate this type of bias, but it was obviously impossible to make all
choices appear equally desirable. The mean proportion of desirable
responses (0.58), however, indicates that this bias was largely eliminated.
Evidence that the blank does not involve verbal variance lies in its
correlation with Reading Comprehension. A sample of 535 pilots yielded a
correlation of only 0.10, corrected for attenuation in both variables. It
appears, then, that the blank might well show positive correlation with
measures of leadership whenever satisfactory criteria are found.

Evaluation  of  Measures  of  Social  Intelligence  in  Leadership 

Although  statistical  proof  of  the  usefulness  of  the  tests  described  in 
this  section  is  lacking,  certain  considerations  suggest  that  the  approach 
employed  is  one  of  the  most  promising  in  the  field  of  temperament.  The 
trait  described  here  as  social  intelligence  is  probably  one  of  the  most 
important determinants of success in personal relations. If this trait can
be  measured,  the  results  will  have  far  broader  significance  than  for  merely 
air-crew  selection  or  pilot  specialization. 

Results of investigation in this area revealed or further emphasized
difficulties that hinder the construction of reliable measures. In common
with many other temperament tests, these instruments are somewhat
susceptible to dishonest manipulation. Rapport is therefore extremely
important to the achievement of validity. Another danger is that
examinees will answer according to considered intellectual judgments rather
than according to emotional reactions such as would come into play in
face-to-face situations.

EVALUATION  OF  TESTS  OF  SPECIFIC  TRAITS 
OF  TEMPERAMENT 

The heterogeneity of the traits evaluated by the instruments described
in this chapter makes it impossible to apply any one evaluative
descriptive statement to them. In general, the degrees of success or failure
achieved have been noted following the various test or area discussions.
Probably the outstanding results reported in this chapter are those
obtained in the study of the carefulness tests. The character of the tasks
involved and the identity of most of the factors found seem to indicate
that the tests resemble aptitude tests more than they do temperament tests.
In spite of this fact, however, these findings very strongly suggest that
the factor analysis technique should be applied in the study of
temperament as well as intellectual measurements.

The  experience  gained  through  the  development,  statistical  treatment, 
and  results  of  tests  here  reported,  has  suggested  more  strongly  than  ever 
the  necessity  of  achieving  or  adhering  to  certain  additional  standards  in 
the  construction  of  tests  in  the  temperament  area.  These  standards  are 
notable  in  this  connection,  chiefly  because  they  have  already  become 
practically  axiomatic  in  the  areas  of  sensation,  perception,  and  intellect. 

Probably many requirements or standards could be listed, but four
appear to be especially pertinent in this connection. First is the necessity
of maintaining a high level of objectivity. It may be argued that complete
objectivity cannot be attained in the measurement of traits of
temperament. Although this may be true in part, it must certainly be agreed
that the maximum attainable objectivity is desirable.

A second requirement is that of reliability. Under this heading might
be listed the desirability of having relatively homogeneous material. It
appears that many measures in the area of temperament have covered
such a wide variety of traits or functions that they constitute a reliable
measure of none.

A third requirement is also associated with reliability. It appears
necessary to eliminate the element of social acceptability by making
alternatives subject to equal or nearly equal social approbation. This goal
is obviously difficult to attain when the characteristics being investigated
are frequently associated with antisocial behavior. Valid measurement of
traits of temperament cannot be achieved, however, by means of instruments
that are subject to gross intellectual manipulation.

A fourth requirement has to do with the traits and characteristics
selected for study. There should be a logically sound rationale for both
the study of the trait and the method to be used. Faulty premises, illogical
methods, and irrelevant evidence do not lead to positive results.

The  four  suggestions  listed  obviously  do  not  cover  all  the  necessary 
rules  to  be  observed  in  the  construction  of  temperament  tests.  They  do 
cover  weaknesses  which  have  been  particularly  noticeable  in  the  tests 
reported  in  this  chapter.  It  is  felt  that  as  these  requirements  are  met  in 
test  construction,  instruments  of  greater  usefulness  will  result. 


CHAPTER TWENTY-SIX


Measures of Motivation 1


INTRODUCTION 

Importance of Measures of Motivation

All individuals concerned with classification of aviation students
conceded that it is important to place men in the type of training in which
they have the most interest and motivation, or, at least, not to assign
them to training in which they have little or no interest. Evidence is
available that indicates the importance of motivation in both training and
combat. This evidence is given in part in chapter 1 and is reviewed in
chapter 22.

An  Over-all  View  of  Motivation  Test  Development 

In  developing  measures  of  motivation,  two  approaches  were  taken: 
(1)  a  self-assessment  by  the  student  through  his  statement  of  preferences 
and  (2)  a  more  objective  approach  through  tests  of  attitudes  and  interests. 

Preference  statements. — In  the  light  of  the  evidence  concerning  the 
importance  of  motivation,  arrangements  were  made  to  permit  students  to 
express  their  degrees  of  interest  in  and  preferences  for  the  different 
air-crew  positions.  Especially  in  the  beginning  of  the  classification  pro¬ 
gram,  and  to  some  extent  as  late  as  May  1945,  student  preferences  were 
used  as  a  guide  to  classification.  At  times,  quotas  seriously  interfered  with 
the  assignment  of  men  to  the  air-crew  positions  they  preferred.  Unofficial 
letters  and  field  trips  to  various  flying  schools  confirmed  the  expectation 
that  this  procedure  would  cause  a  considerable  lowering  of  student 
morale.  When  the  training  program  was  at  its  peak,  therefore,  and  many 
candidates  were  admitted  to  training,  expressed  preferences  were  fol¬ 
lowed  as  much  as  possible  in  assignment.  Besides  the  limitation  of  quotas, 
air-crew  aptitude  scores  became  another  factor  interfering  with  this 
policy,  particularly  as  qualifying  standards  in  terms  of  aptitudes  rose. 
By  June  1945  the  qualifying  scores  (stanines)  were  so  high  that  only  a 
small  proportion  of  the  candidates  could  be  assigned  to  training,  and 
preference  statements  were  no  longer  obtained. 

The  appropriate  instrument  for  obtaining  the  student’s  statement  of 
preference was not as easy to write as might be supposed. The preference
blank went through several revisions, as will be related in the following
pages. 

The first preference blank, introduced early in 1942, included a list
of eleven air and ground duties. The student was required to rank these
in order of his preferences. This form was revised very shortly, however,
because of the preponderance of high preferences for air-crew positions.
On the revised blank, the student was asked to rank only the three air-crew
positions. A blank space provided for his indicating any nonair-crew
preference he might have.

¹ Written by Sgt. David Grossman.

The  examinee's  first  preference  was  given  considerable  weight  in 
recommending  men  for  bombardier,  navigator,  and  pilot  training,  when 
his  aptitude  scores  were  relatively  uniform.  For  example:  an  aviation 
student  indicated  his  preferences  as  pilot  first,  navigator  second,  and 
bombardier third. His stanines were 6 for bombardier, 8 for navigator,
and 7 for pilot, which qualified him for all three air-crew positions. Because
the student's preferences were known, it was possible to recommend
him for his first preference; namely, pilot.
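
The rule in this example can be sketched in a few lines of Python. This is only an illustration: the report gives neither a qualifying stanine nor a definition of "relatively uniform," so the `qualifying` and `max_spread` parameters, the function name, and the fallback to the highest stanine are all assumptions made for the sketch.

```python
def recommend(stanines, ranked_prefs, qualifying=5, max_spread=2):
    """Return a recommended position, or None if the student qualifies for none.

    qualifying and max_spread are hypothetical parameters, not values
    taken from the report.
    """
    # Keep preference order, dropping positions below the qualifying stanine.
    qualified = [pos for pos in ranked_prefs if stanines[pos] >= qualifying]
    if not qualified:
        return None
    scores = [stanines[pos] for pos in qualified]
    if max(scores) - min(scores) <= max_spread:
        # Aptitudes are "relatively uniform": honor the highest-ranked preference.
        return qualified[0]
    # Otherwise fall back on the highest stanine (an assumption of this sketch).
    return max(qualified, key=lambda pos: stanines[pos])


# The student described above: stanines 6, 8, 7; preferences pilot, navigator, bombardier.
student = {"bombardier": 6, "navigator": 8, "pilot": 7}
print(recommend(student, ["pilot", "navigator", "bombardier"]))  # pilot
```

With the student described in the text, the three stanines differ by no more than two points, so the first preference (pilot) is returned.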

The  preference  waiver. — An  additional  device  known  as  the  preference 
waiver  was  also  included  in  this  later  form  of  the  preference  blank. 
Primarily,  the  preference  waiver  was  introduced  because  further  in¬ 
formation  concerning  motivation  was  needed.  Over  85  percent  of  the  ex¬ 
aminees  chose  pilot  training  as  their  first  preference.  Since  such  a  large 
proportion  of  examinees  could  not  be  assigned  to  pilot  training,  some 
means  had  to  be  provided  for  the  student  to  express  his  willingness  to 
be  classified  according  to  the  test  results  rather  than  by  his  preferences. 

The strength-of-interest scale. — The limitations of the ranking of
preferences soon became apparent, and research studies were instituted
to devise a better technique for measuring the student's interest. A different
type of preference blank, known as the strength-of-interest scale, was
constructed, which allowed the examinee to express his strength of interest
for each category on a graphic rating scale, with descriptive categories
ranging from "little or no interest" to "exceptionally strong interest," as
shown in figure 26.1.

[Figure: a graphic rating scale numbered 1 to 9, with descriptive categories
running from "little or no interest" to "exceptionally strong interest."]

FIGURE 26.1. SAMPLE STRENGTH-OF-INTEREST SCALE USED IN THE PREFERENCE BLANK

Both the preference waiver and strength-of-interest scale indicated
which students needed to be interviewed prior to classification for a type
of training other than their first preference. If a student's stanines were
bombardier 6, navigator 8, and pilot 3, his strengths of interest, bombardier
3, navigator 3, and pilot 9, and he had indicated that he would not
willingly accept assignment to training other than that of his first
preference, he would be interviewed before he was recommended for
navigator training.
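
The screening described here amounts to a simple check, sketched below. The function and argument names are hypothetical, and treating the highest stanine as the proposed assignment is an assumption of the sketch rather than a rule stated in the report.

```python
def needs_interview(stanines, interests, accepts_other_assignment):
    """Flag a student for interview before classification.

    Assumption of this sketch: the proposed assignment is the position with
    the highest stanine, and the first preference is the position with the
    highest strength-of-interest rating.
    """
    first_preference = max(interests, key=interests.get)
    proposed = max(stanines, key=stanines.get)
    return proposed != first_preference and not accepts_other_assignment


# The student in the text: best aptitude for navigator, strongest interest in pilot,
# unwilling to accept training other than his first preference.
stanines = {"bombardier": 6, "navigator": 8, "pilot": 3}
interests = {"bombardier": 3, "navigator": 3, "pilot": 9}
print(needs_interview(stanines, interests, accepts_other_assignment=False))  # True
```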

At the same time that preference blanks were undergoing improvements,
efforts  were  being  made  to  assess  motivation  more  objectively  in  terms 
which  could  be  included  in  the  composite  aptitude  scores.  It  was  never 
felt  that  subjective  expression  of  interest  was  accurate  enough,  no  matter 
how  well  scaled,  to  be  included  in  the  stanines. 

Tests  of  attitudes  and  interests. — Experience  with  various  types  of 
attitude  and  interest  tests  will  be  related  in  this  chapter.  They  include: 
Satisfaction, CE409A, B, C, and D; Aviation Preference Check List
(no code); Inventory of Experiences, Interests, and Attitudes, CE612-
AX2; Specialization Preference Inventory, CE610A; Specialization Interest
Inventory, CE609A; Social Concepts, CE512A; Survey of
Personal Attitudes, CE508A; Inventory of Attitudes, CE518A; Conduct
of the War Test, CE520A; and Home Front Attitude Inventory, CE446A.

PREFERENCE  BLANKS 

Aviation  Cadet  Training  Preference  Blank,  CE501E* 

The  preference  blank  differs  from  most  classification  instruments  in 
that  it  is  not  a  test.  Form  E  of  the  blank  will  be  described  in  much  detail, 
since  it  was  used  over  the  longest  period  of  time,  and  since  it  is  the  result 
of  the  accumulated  experience  gained  in  using  previous  forms,  CE501A 
and  D,  and  CE  509A. 

Description. — In  the  first  part  of  the  blank  the  examinee  is  asked 
to  state  his  degree  of  interest  in  each  type  of  training  by  encircling  the 
number  which  represents  that  degree  of  interest.  A  sample  scale  is  shown 
in  figure  26.1. 

There  are  three  scales,  one  for  each  of  the  three  types  of  air-crew  train¬ 
ing.  If  the  examinee  has  a  stronger  interest  in  any  other  type  of  training, 
he  can  name  it  and  express  this  interest  on  a  fourth  scale. 

There  are  many  reasons  for  using  a  graphic  rating  scale  rather  than  a 
ranking  of  preferences  as  in  the  A  and  D  forms  of  the  blank.  In  the  first 
place,  the  linear  scale  allows  for  ties.  Second,  it  permits  an  aviation 
student  to  show  unequal  differences  in  interest  between  his  first  and 
second,  and  his  second  and  third  preferences.  Ranking  of  preferences 
implies  equal  distances.  Third,  it  presents  a  satisfactory  way  in  which 
an  examinee  can  show  lack  of  interest  for  one  or  more  types  of  air-crew 
training.  Fourth,  it  allows  the  examinee  to  consider  each  air-crew  posi¬ 
tion  separately,  on  its  own  merits,  instead  of  in  relation  to  the  other  two 
positions.  The  examinee  is  forced  to  think  much  more  carefully,  since 
he  is  required  to  express  himself  in  more  specific  terms.  Finally,  this 
method  enables  the  examinee  to  express  his  first  preference  as  somewhat 
below maximum strength. A student who gives ranked preferences of
bombardier 2, navigator 1, and pilot 3, may not have "exceptionally strong
interest" for navigator, as ranking might imply.
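
The information lost by ranking can be shown with a short sketch. The ratings below are invented for illustration; only the resulting rank order (bombardier 2, navigator 1, pilot 3) and the tied pattern of 3, 3, 9 correspond to examples in the text.

```python
def ratings_to_ranks(ratings):
    """Competition ranking: 1 = strongest interest; tied ratings share a rank."""
    return {
        position: 1 + sum(other > rating for other in ratings.values())
        for position, rating in ratings.items()
    }


# Two hypothetical students whose 9-point ratings differ greatly in strength...
mild = {"bombardier": 5, "navigator": 6, "pilot": 4}
keen = {"bombardier": 2, "navigator": 9, "pilot": 1}
# ...yet both collapse to the same ranking: bombardier 2, navigator 1, pilot 3.
print(ratings_to_ranks(mild) == ratings_to_ranks(keen))  # True

# A tie on the graphic scale cannot be expressed by a strict ranking at all.
tied = {"bombardier": 3, "navigator": 3, "pilot": 9}
print(ratings_to_ranks(tied))  # {'bombardier': 2, 'navigator': 2, 'pilot': 1}
```

The first comparison illustrates the second and fifth reasons given above (unequal distances, first preference below maximum strength); the last line illustrates the first (ties).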

* Developed at Psychological Research Unit No. 1. Chief contributor: Maj. Frederick Wickett

On the other hand, graphic rating scales have some disadvantages.
Because  the  linear  scale  did  allow  for  ties,  interviews  were  often  re¬ 
quired  before  classification  recommendations  were  made.  The  most 
serious  disadvantage  is  that  one  may  be  misled  into  assuming  that  the 
points  on  the  scale  have  the  same  meanings  for  all  examinees. 

The  second  part  of  the  preference  blank  is  called  the  preference  waiver. 
The  examinee  must  check  one  of  the  following  statements: 

1. I want to be assigned to the kind of air-crew training for which I show the
greatest  ability  on  the  tests. 

2.  I  want  to  be  assigned  to  the  kind  of  air-crew  training  for  which  I  show  the 
greatest  ability  on  the  test  only  if  my  ability  for  that  kind  of  training  is  much 
greater  than  for  any  other  kind. 

3. I want to be assigned to the kind of air-crew training in which I am most
interested  unless  the  tests  show  that  I  should  probably  fail  in  that  kind  of  training. 

4.  I  want  to  be  assigned  to  the  kind  of  air-crew  training  in  which  I  am  most 
interested  even  if  the  tests  show  that  I  should  probably  fail  in  that  kind  of  training. 

These  statements  are  worded  simply  and  in  nontechnical  terms.  It 
was  found  that  many  students  experienced  serious  difficulty  in  com¬ 
prehending  the  statements  in  the  earlier  Form  D.  The  wording  was  too 
academic  and  was  written  from  the  standpoint  of  a  psychologist  rather 
than from that of an aviation student. During interviews, students expressed
inability to understand such concepts as "distinctly higher,"
"aptitude," or "prediction."

(1)  Administration. — The  preference  blank  was  administered  at  the 
beginning  of  the  first  session  of  group  testing,  and  it  required  approxi¬ 
mately  10  minutes.  No  time  limit  was  set.  The  following  are  excerpts 
from  the  directions  to  the  preference  blank: 

It is important that every cadet be assigned to a kind of air-crew training in
which he can succeed. Two factors that determine how well a cadet will succeed
are:

A. His scores on the classification tests, which measure how much ability he
has for bombardier, navigator, and pilot training.

B. How strong an interest he has in each type of training. Evidence shows
that a cadet is more successful in a type of training in which he is
intensely interested.

In stating your interests, you should consider these matters carefully:

A. How much you know about the duties of each member of the air crew.

B. Whether your own ability, education, and training in your own judgment
fit you for one kind of work rather than another.

C. How much you desire each type of training, and how willing you are to
work hard to succeed in it.

Statistical  results.— Many  research  studies  were  instituted  in  connec¬ 
tion  with  the  preference  blank.  As  a  result,  many  data  exist.  A  repre¬ 
sentative  sampling  is  submitted  here. 

(1) Distribution statistics. — Typical examples of distribution statistics
obtained on this blank are given in table 26.1. Distributions of students
according to preference waiver are shown in table 26.2.

Table 26.1. — Means and standard deviations of ratings on the strength-of-interest
scale administered at the time of classification testing

                                                 Mean rating           SD of rating
Group                                  N      Bomb.   Nav.   Pilot    Bomb.   Nav.
Pilots in primary training¹           707      5.6    5.1     8.5      1.9    2.0
  Do²                              11,423      5.5    5.9     8.6      1.9    2.1
Navigators in advanced training³    1,953      5.1    7.2     7.9      1.9    1.9

¹ In class 44B. Tested at Psychological Research Unit No. 3.
² In class 44C. Tested at Psychological Research Units Nos. 1, 2, and 3.
³ In classes 43-12 to 43-15 inclusive. Tested at Psychological Research Units Nos. 1, 2, and 3.

Table 26.2. — Percentages of aviation students selecting each type of preference
waiver administered at the time of classification testing

                                            Type of waiver¹
Group                                   1       2       3       4
Pilots in primary training²           38.5    12.5    44.9     3.1
  Do³                                 29.3    10.2    52.1     3.4
Navigators in advanced training⁴      34.0    14.4    46.4     5.2

¹ For types see page 000.
² In class 44B. Tested at Psychological Research Unit No. 3.
³ In class 44C. Tested at Psychological Research Units Nos. 1, 2, and 3.
⁴ In classes 43-12 to 43-15 inclusive. Tested at Psychological Research Units Nos. 1, 2, and 3.

(2) Validity of strength-of-interest ratings. — In tables 26.3 to 26.9
are presented the results of numerous studies of the relation between
graduation-elimination from various types of air-crew training and the
variables of strength of interest and of first preference (i.e., air-crew
position receiving highest strength-of-interest rating).

Table 26.3. — Relation of first preference to graduation-elimination for samples of
bombardier, navigator, and pilot trainees

                                                                         Values required
                                                      Obtained           for significance
Group¹                                 N       p     chi-square   df²   5 percent  1 percent
Bombardiers, 12-week course³         1,706   0.88       5.79       4      9.49      13.28
Bombardiers, 18-week course⁴           455    .84       3.31       3      7.82      11.34
Navigators in advanced training⁵     1,953    .79      22.80       3      7.82      11.34
Pilots in primary training⁶         11,423    .84      97.55       3      7.82      11.34
Pilots in basic training⁷            6,702    .87       1.47       3      7.82      11.34

¹ All samples consist of examinees of the three Psychological Research Units.
² df = degrees of freedom.
³ In classes 43-15 to 44-1 inclusive.
⁴ In classes 43-14 to 43-18 inclusive. The 18-week course, unlike the 12-week course, includes
some training in navigation.
⁵ In classes 43-12 to 43-15 inclusive.
⁶ In class 44C.
⁷ In class 44I.

Table 26.4. — Relation of first preference to graduation-elimination of pilot trainees
from primary training (N_t = 11,423;¹ p = 0.84)

                      Percentage      Percentage preferring
First preference      preferring²     and graduating³            r
Bombardier               2.9               1.2               [illegible]
Navigator                7.6               5.8               [illegible]
Pilot                   83.2              71.7               [illegible]

¹ In class 44C. Tested at Psychological Research Units Nos. 1, 2, and 3.
² Percentage preferring type of training.
³ Percentage preferring type of training and graduating.
⁴ Split too extreme for computation of r.


Table 26.5. — Percentages eliminated by pilot stanine and first preference, based on
5,501 pilots in primary training¹

                                 First preference
Pilot stanine           Bombardier    Navigator      Pilot
                         (N = 98)     (N = 254)   (N = 5,149)
7-9                        28.6         23.9         13.2
4-6                        43.8         54.0         29.6
1-3                        70.4         70.8         49.7
Total                      50.0         47.6         28.4
Mean pilot stanine         4.45         5.51         5.34

¹ In class 43F. Tested at the three Psychological Research Units.


Table 26.6. — Relation of strength-of-interest rating to graduation-elimination of
pilot trainees from primary training (N_t = 11,423;¹ p = 0.84)

Interest category      M_g      M_e      SD_t     r_bis²
Bombardier             5.47     5.72     1.93     -0.07
Navigator              5.88     6.13     2.10      -.07
Pilot                  8.66     8.50      .91       .10

¹ In class 44C. Tested at the three Psychological Research Units.
² A biserial r of approximately 0.03 is required for significance at the 5 percent level, and
of 0.04 at the 1 percent level.


Table 26.7. — Relation of strength-of-interest to graduation-elimination of navigator
trainees from advanced training (N_t = 1,953;¹ p = 0.79)

Interest category      M_g      M_e      SD_t     r_bis²
Bombardier             5.13     4.97     1.92      0.05
Navigator              7.34     6.84     1.94       .15
Pilot                  7.89     7.99     1.61      -.03

¹ In classes 43-12 to 43-15 inclusive. Tested at the three Psychological Research Units.
² A biserial r of approximately 0.06 is required for significance at the 5 percent level and of
0.08 at the 1 percent level.


Table 26.8. — Relation of strength-of-interest to graduation-elimination of
bombardier trainees from 12-week course (N_t = 1,706;¹ p = 0.88)

Interest category      M_g      M_e      SD_t     r_bis²
Bombardier             6.94     6.83     2.03      0.02
Navigator              5.69     5.49     2.34       .05
Pilot                  7.68     7.70     1.81      -.01

¹ In classes 43-15 to 44-1 inclusive. Tested at the three Psychological Research Units.
² A biserial r of approximately 0.08 is required for significance at the 5 percent level and of
approximately 0.10 at the 1 percent level.


Table 26.9. — Relation of strength-of-interest rating to graduation-elimination of
bombardier trainees from the 18-week course¹ (N_t = 513;² p = 0.86)

Interest category      M_g      M_e      SD_t           r_bis³
Bombardier             7.00     6.87     [illegible]    0.0[?]
Navigator              5.25     4.96     2.44            .07
Pilot                  7.60     7.85     1.86           -.07

¹ The 18-week course included some navigation training.
² In classes 43-14 to 43-18 inclusive. Tested at the three Psychological Research Units.
³ A biserial r of approximately 0.13 is required for significance at the 5 percent level and
of approximately 0.18 at the 1 percent level.

Tables 26.3 and 26.4 show that there is a slight but significant relationship
between graduation from primary pilot training and first preference.
This relation, however, does not hold for basic training (table 26.3).
Similarly, table 26.6 shows a slight but significant relationship between
graduation from primary pilot training and strength of interest. These
results are confirmed by the data as shown in table 26.5. At each stanine level, those
whose first preference is for bombardier or navigator training have a
considerably higher elimination rate from primary pilot training than
those whose first preference is for pilot training.
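
The biserial coefficients and the significance thresholds quoted under tables 26.6 to 26.9 can be approximated as follows. The report does not state its formulas; the sketch assumes the standard textbook biserial coefficient, r = ((M_g - M_e) / SD_t)(pq / y), and its large-sample standard error under zero correlation, sqrt(pq) / (y sqrt(N)), where y is the normal ordinate at the point dividing the proportions p and q. Function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

_NORMAL = NormalDist()


def _ordinate(p):
    """Height of the unit normal curve at the cut that splits it into p and 1 - p."""
    return _NORMAL.pdf(_NORMAL.inv_cdf(p))


def biserial_r(mean_grad, mean_elim, sd_total, p):
    """Textbook biserial r from group means, total SD, and proportion graduating."""
    return (mean_grad - mean_elim) / sd_total * p * (1 - p) / _ordinate(p)


def biserial_threshold(n, p, alpha):
    """Smallest |r_bis| significant at level alpha (two-tailed, large sample)."""
    se = sqrt(p * (1 - p)) / (_ordinate(p) * sqrt(n))
    return _NORMAL.inv_cdf(1 - alpha / 2) * se


# Pilot interest among primary pilot trainees (table 26.6 values).
print(round(biserial_r(8.66, 8.50, 0.91, 0.84), 2))    # 0.1
print(round(biserial_threshold(11423, 0.84, 0.05), 2))  # 0.03
print(round(biserial_threshold(11423, 0.84, 0.01), 2))  # 0.04
```

Under these assumptions the sketch reproduces the tabled pilot coefficient of .10 and the footnoted thresholds of 0.03 and 0.04, which suggests, without establishing, that comparable formulas were used.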

First preference and strength of interest for navigation have significant
correlations with success in advanced navigation training (see tables 26.3
and 26.7).

Bombardier first preference and strength of interest show no relation
to the criterion (see tables 26.3, 26.8, and 26.9).
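
The chi-square comparisons in table 26.3 can be checked without a printed table. The sketch below uses the closed-form upper-tail probability of the chi-square distribution for whole-number degrees of freedom; it is a modern convenience, not a description of how the original computations were made, and the function name is illustrative.

```python
from math import erfc, exp, pi, sqrt


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for chi-square with integer df >= 1."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite series in x/2.
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return exp(-half) * total
    # Odd df: a normal-tail term plus a finite series in odd powers of sqrt(x).
    term, total = sqrt(x), 0.0
    for j in range(1, (df - 1) // 2 + 1):
        if j > 1:
            term *= x / (2 * j - 1)
        total += term
    return erfc(sqrt(half)) + sqrt(2.0 / pi) * exp(-half) * total


# Tabled critical values should give tail areas close to 0.05 and 0.01.
print(round(chi2_sf(7.82, 3), 3), round(chi2_sf(11.34, 3), 3))

# Obtained values from table 26.3 (group, chi-square, df).
for group, chi2, df in [("Navigators, advanced", 22.80, 3),
                        ("Pilots, basic", 1.47, 3)]:
    print(group, "significant at 1 percent:", chi2_sf(chi2, df) < 0.01)
```

The navigator value falls well beyond the 1 percent point, while the basic-training value does not approach the 5 percent point, in agreement with the statements above.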

(3)  Validity  of  the  preference  waiver. — Tables  26.10  to  26.12  give  the 
results  of  studies  to  determine  the  relation  between  the  preference 
waiver  and  graduation-elimination  from  various  types  of  air-crew 
training. 

Table 26.10. — Relation of preference waiver to graduation-elimination for samples
of bombardier, navigator, and pilot trainees¹

                                                                         Values required
                                                      Obtained           for significance
Group                                  N       p     chi-square   df²   5 percent  1 percent
Bombardiers, 12-week course³         1,706   0.88       1.65       3      7.82      11.34
Bombardiers, 18-week course⁴           455    .84      10.35       3      7.82      11.34
Navigators in advanced training⁵     1,953    .79       6.00       3      7.82      11.34
Pilots in primary training⁶         11,423    .84       4.44       3      7.82      11.34
Pilots in basic training⁷            6,702    .87       9.49       3      7.82      11.34

¹ All samples consist of examinees of the three Psychological Research Units.
² Degrees of freedom.
³ In classes 43-15 to 44-1 inclusive.
⁴ In classes 43-14 to 43-18 inclusive.
⁵ In classes 43-12 to 43-15 inclusive.
⁶ In class 44C.
⁷ In class 43I.

Table 26.11. — Relation of preference waiver to graduation-elimination of pilot
trainees from primary training (N_t = 11,423;¹ p = 0.84)

                        Percentage       Percentage selecting
Preference waiver²      selecting³       and graduating⁴           r
1                       [illegible]      [illegible]             -0.03
2                       [illegible]      [illegible]               .06
3                       [illegible]      [illegible]              -.01
4                       [illegible]          2.8                   (⁵)

¹ In class 44C. Tested at the three Psychological Research Units.
² For categories see page 724.
³ Percentage of total group selecting waiver.
⁴ Percentage of total group selecting waiver and graduating.
⁵ Split too uneven for computation of r.

The preference waiver does not seem to bear much relation to graduation-elimination
(see tables 26.10 and 26.11). Two significant relations
are found between the 5 percent and 1 percent levels: one in a sample of
pilots in basic training and the other in a sample of bombardiers taking the
18-week course. None of the other groups shows relationships significantly
different from zero. It can be seen readily in table 26.12, however, that
those students who check waiver 4 (I want to be assigned to the kind of
air-crew training in which I am most interested even if the tests show that
I should probably fail in that kind of training) are less likely to be eliminated
at each stanine level.

Table 26.12. — Percentages eliminated by pilot stanine and preference waiver, based
on 7,826 pilots in primary training¹

                                  Preference waiver²
Pilot stanine              1        2        3        4
7-9                      19.4     15.4     11.1      5.6
4-6                      36.4     30.1     27.5     24.9
1-3³                     60.2     51.3     46.0     [illegible]
Total                    36.9     28.7     27.1     23.7
Mean pilot stanine       5.11     5.46     5.32     5.34

¹ Class designation illegible. Tested at the three Psychological Research Units.
² For categories see page —.
³ Only three of the four entries in this row are legible; their column positions are uncertain.

(4)  Relationship  of  strength  of  interest  and  first  preference  to  the 
stanines. — Studies were made to investigate the relation between the
stanines  and  preferences  and  waivers.  It  also  seemed  desirable  to  compute 
the  differences  in  mean  stanine  scores  for  students  expressing  different 
first  choices.  The  data  are  shown  in  tables  26.13,  26.14,  and  26.15. 


Table 26.13. — Correlations of strength of interest with stanines of 700 unclassified
aviation students¹

                           Interest category
Stanine           Bombardier    Navigator    Pilot²
Bombardier          -0.02         0.19        0.07
Navigator            -.14          .29         .03
Pilot                -.12          .09         .05

¹ A product-moment r of approximately 0.07 is required for significance at the 5 percent
level and of 0.10 at the 1 percent level.
² Correlations in this column are biserials contrasting strength-of-interest rating of 9 with
all others. The p is unknown.


Table 26.14. — Mean stanines by first preferences of 678 unclassified
aviation students¹

The table crosses stanine (bombardier, navigator, pilot) with first preference
(bombardier, navigator, pilot). The arrangement of the cells is not recoverable;
the nine legible entries are 4.81, 4.38, 5.55, 5.49, 5.30, 5.95, 5.13, 4.78, and 5.51.

¹ Tested at Psychological Research Unit No. 3 in December 1943 and January 1944.


Table 26.15. — Critical ratios of differences between mean stanines of students
preferring one type of training and mean stanine of students preferring all other
types, based upon 678 unclassified aviation students

                            First preference
Stanine           Bombardier    Navigator    Pilot
Bombardier          -3.94         3.81       -2.77
Navigator           -5.50         6.57       -2.80
Pilot               -2.3[?]       1.41       -0.66

In general there is but slight relation between strength of interest and
the stanine. It will be seen in table 26.13, however, that the strength of
interest for navigator training has significant positive correlations with
all three stanines, especially with the navigator stanine. First preference
for bombardier training has very low but significant negative correlations
with the navigator and pilot stanines. Tables 26.14 and 26.15 show that
students whose first preference is for navigator training have higher
average stanines in all three specialties, those for bombardier and navigator
aptitude being significantly different from the general means of
all other students. Students whose first preference is for bombardier
training have significantly lower mean stanines in all three specialties.
(See also table 26.5.)

(5) Relation of stanine validity to first preference and preference
waiver. — An analysis was made to test the hypothesis that the pilot
stanine  would  have  higher  validity  for  those  whose  first  preference  is 
for  pilot  training.  The  results  are  reported  in  table  26.16.  Similar  results 
were  computed  for  the  preference  waiver  and  are  presented  in  table  26.17. 


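Table 26.16. — Biserial correlations of pilot stanine¹ with graduation-elimination

First preference          N              Pilot stanine validity
Bombardier             [illegible]             0.43
Navigator              [illegible]              .49

Two further validity entries, .43 and .46, are legible. The surviving row labels
include "Navigator or bombardier"; the pairing of these entries with their
rows is not recoverable.

¹ In class 43I. Tested at Psychological Research Unit No. 3.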


Table 26.17. — Biserial correlations of the pilot stanine¹ with graduation-elimination
from primary training when pilot trainees² are grouped by preference waiver

Preference waiver³         N             Pilot stanine validity
1                       2,2[??]                0.39
2                       2,432                   .37
3                       2,8[?]2                 .43⁴
4                         291                   .50⁵

¹ Derived from the classification battery of December 1943.
² In class 43K. Tested at Psychological Research Unit No. 3.
³ For categories see page —.
⁴ Significantly different from 0.37, with a critical ratio of 2.6.
⁵ Significantly different from 0.39 and 0.37, with critical ratios of 3.3 and 2.5 respectively.


It can be seen in table 26.16 that the validity of the pilot stanine is
not higher for those whose first preference is pilot training. On the
contrary, the validity of the pilot stanine is slightly higher for pilots
whose first preference is for navigator training. None of the differences
is statistically significant, the highest critical ratio being 1.2. In table
26.17, it may be seen that the preference waiver has a rather significant
relationship to the validity of the pilot stanine. The predictive value of
the stanine is highest for the men who chose preference waiver 4.
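
The report does not say how the critical ratios between validity coefficients were computed. One conventional approach, shown here only as a sketch, is Fisher's z transformation; it is normally applied to product-moment correlations, so its use on biserial coefficients is an approximation and an assumption of this example. The two sample sizes used are the legible ones in table 26.17.

```python
from math import atanh, sqrt


def critical_ratio(r1, n1, r2, n2):
    """Difference between two independent correlations in standard-error units,
    using Fisher's z transformation (an assumed method, not the report's)."""
    z_difference = atanh(r1) - atanh(r2)
    standard_error = sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return z_difference / standard_error


# Waiver 4 (validity .50, N = 291) against waiver 2 (validity .37, N = 2,432).
print(round(critical_ratio(0.50, 291, 0.37, 2432), 2))  # 2.58
```

The result lies close to the critical ratio of 2.5 reported for this comparison, though the agreement does not show that the same method was used.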

(6) Relationship between strength of interest and classification-test
scores. — To ascertain the relation between tests in the August 1942
battery and the intensity of motivation of aviation students, intercorrelations
were computed. This study is important, because certain tests may
be regarded on an a priori basis as indirect measures of interest. Some,
indeed, were designed as such (e.g., Technical Vocabulary, CE505C).
The results are shown in table 26.18.


Table 26.18. — Correlations of strength of interest with tests of the August 1942
classification battery, based on 707 pilot trainees¹ ²

                                                       Interest category
Test and code number                            Pilot³   Bombardier   Navigator
Technical Vocabulary (Pilot), CE505C             0.02      -0.10         0.04
Technical Vocabulary (Bomb.), CE505C             -.04       -.02          .11
Technical Vocabulary (Nav.), CE505C              -.07       -.18          .22
Speed of Identification, CP610A                  -.02        .00          .07
Mathematics, CI702E                              -.12       -.20          .34
Numerical Approximations, CI706A                 -.13       -.01          .21
Reading Comprehension, AC10D                     -.08       -.08          .20
Mechanical Comprehension, AC10D                  -.16       -.18          .03
Table Reading, CP621A                             .06        .02          .14
Numerical Operations, CI702B                     -.07        .03          .13
Spatial Orientation I, CP501B                    -.07       -.03          .06
Spatial Orientation II, CP503B                   -.03       -.12          .03
Arithmetic Reasoning, CI206B                      .07       -.04          .19
Dial Reading, CP622A                              .01       -.10          .13
Complex Coordination, CM701A                     -.02       -.06          .04
Steadiness, CM103A                               -.06       -.02          .00
Finger Dexterity, CM116A                          .04        .04          .06
Discrimination Reaction Time, CP611D              .01        .01          .15

¹ Tested at Psychological Research Unit No. 3.
² A product-moment r of approximately 0.07 is required for significance at the 5 percent level
and of approximately 0.10 at the 1 percent level.
³ Biserial correlations. A biserial r of approximately 0.10 is required for significance at the
5 percent level and of approximately 0.13 at the 1 percent level.


A  few  important  findings  should  be  noted  from  table  26.18.  Strength 
of  interest  for  pilot  training  shows  significantly  negative  relationships 
with  three  tests.  These  tests  are  Mechanical  Comprehension  (with  a  cor¬ 
relation  of  —0.16),  Mathematics  (—0.12),  and  Numerical  Approxima¬ 
tions  (—0.13).  Six  tests  show  significantly  negative  correlations  with 
bombardier  interest.  Strength  of  interest  for  navigator  training  has  sig¬ 
nificant  positive  correlations  with  eleven  tests.  Of  these,  Technical  Vocab¬ 
ulary  (navigator),  Mathematics,  Numerical  Approximation,  Reading 
Comprehension,  and  Arithmetic  Reasoning  are  the  tests  that  have  proved 
themselves  most  valid  for  success  in  navigation  training. 
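
The product-moment thresholds quoted under table 26.18 are consistent with the usual large-sample approximation, in which a correlation of zero has a standard error of about 1 / sqrt(N - 1). The sketch below assumes that approximation; the report does not state the formula it used, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def r_threshold(n, alpha):
    """Smallest |r| significant at level alpha (two-tailed), large-sample approximation."""
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    return z_critical / sqrt(n - 1)


# The 707 pilot trainees of table 26.18.
print(round(r_threshold(707, 0.05), 2))  # 0.07
print(round(r_threshold(707, 0.01), 2))  # 0.1
```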

The Technical Vocabulary and Information Test (CE505C) was
designed as an interest test, with three interest scores, each for one of
the air-crew specialties. Correlations of first preference for a specialty
with the score for that specialty are negative and not significantly different
from zero except for navigator preference and navigator score
(see table 26.19).

If the expressed specialized preferences are good criteria of specialized
interest, only one score on this test (navigator) proves to be valid for
the purpose intended. There is other evidence (see p. 817) that the pilot
score does measure pilot interest to some extent, which leads us to suspect
expressed pilot preference as a criterion of pilot interest.

Table 26.19. — Relation of first preference to Technical Vocabulary and Information,
CE505C, for unclassified aviation students¹

Score               [entries illegible]
Pilot
Bombardier
Navigator

¹ Biserial correlations, the dichotomy being those whose first preference is for the indicated
position and those whose first preference is for all other positions.
² N_t = 527, p = 0.91. A biserial of 0.15 is required for significance at the 5 percent level and
of 0.20 at the 1 percent level. Group tested in July 1943 at Psychological Research Unit No. 3.
³ N_t = 530, p = 0.95. A biserial of 0.18 is required for significance at the 5 percent level and
0.23 at the 1 percent level. Group tested in July 1943 at Psychological Research Unit No. 3.

(7)  Preferences  in  relation  to  pilot  specialisation. — This  study  was 
part  of  a  larger  project  designed  to  study  the  problem  of  differentiating 
among  aptitudes  for  various  types  of  advanced  specialized  pilot  training. 
Assignment  to  specialized  training  to  an  appreciable  degree  was  a  matter 
of  preference. 

The  means  and  standard  deviations  for  the  strengths  of  preference* 
in  the  three  air-crew  specialties  are  given  in  table  26.20.  In  table  26.21 
critical  ratios  are  presented  for  the  differences  between  mean  strengths 
of  interest  of  students  in  different  types  of  advanced  training.  Two 
critical  ratios  indicate  differences  significant  at  or  beyond  the  5  percent 
level.  These  are  the  critical  ratios  of  the  difference  (1)  in  navigator 
training  interest  for  those  assigned  to  fighter  training  and  those  assigned 
to  heavy  bomber  training,  and  (2)  in  pilot  training  interest  for  those 
assigned  to  fighter  or  medium-bomber  training.  The  first  ratio  indicates 
that  pilots  who  are  assigned  to  training  on  heavy  bombers  express  more 
interest  in  navigation  than  those  who  become  fighter  pilots.  The  differ¬ 
ence  between  medium  and  heavy  bombardment  assignees,  though  not 
significant,  is  in  the  same  direction.  Assignees  to  fighter-pilot  training 
expressed  a  more  intense  desire  for  pilot  training,  in  general,  than  those 
students  given  medium-bomber  training.  The  same  trend  is  indicated 
between  fighter  and  heavy-bember  training. 


u 

* 

Fir»t  preference 

Pilot 

_ d _ 

Bombardier 

Narrator 

MM 

•-o.os 
•  •  •  • 

o  •  •  • 

O  0  o  • 

*0.53 

Table 26.20.—Mean strength of interest for trainees on fighter, medium bomber,
and heavy bomber planes

                      Fighter training    Medium bomber training    Heavy bomber training
Interest category        M       SD           M       SD                M       SD
Pilot ............     8.73    0.71         8.51    1.07              8.83    ....
Bombardier .......     5.14    1.81         ....    1.84              ....    ....
Navigator ........     4.53    2.00         4.58    2.14              ....    ....

[Entries shown as .... are illegible in the source.]



Table 26.21.—Critical ratios 1 of differences between mean interest-strength in types
of advanced pilot training

                    Fighter v. medium    Fighter v. heavy    Medium v. heavy
Category            bomber training      bomber training     bomber training
Bombardier .....        -1.54                ....                ....
Navigator ......        -0.18                ....                ....
Pilot ..........         2.12                ....                ....

[Entries shown as .... and the text of footnote 1 are illegible in the source.]
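
The critical ratios of table 26.21 can be illustrated with a short computation. The sketch below assumes the usual definition, the difference between two independent means divided by the standard error of that difference. The report states neither the formula nor the group sizes, so the function name `critical_ratio` and the group sizes of 100 are hypothetical. The means and standard deviations are the pilot-interest entries for fighter and medium-bomber trainees as they can be read in table 26.20.

```python
import math


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two independent means divided by its standard error."""
    se_diff = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / se_diff


# Pilot-interest means and SDs as read from table 26.20.
# The group sizes (100 each) are hypothetical; the report does not give them here.
cr = critical_ratio(8.73, 0.71, 100, 8.51, 1.07, 100)
print(round(cr, 2))
```

Under the normal approximation, a ratio of 1.96 or more in absolute value corresponds to the 5 percent level. The printed ratio of 2.12 would require the actual group sizes, which are not recoverable from this section.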

Variations of Preference Blanks.—There were five other preference
blanks, two of which—Aviation Cadet Training Preference Blanks,
CE501A and D—were used in the classification battery before December
1942.

(1)  Aviation  Cadet  Training  Preference  Blank,  CE501A  and  D. 
— These  earlier  blanks  were  developed  to  allow  aviation  students  to  indi¬ 
cate  their  training  interest,  as  well  as  to  obtain  a  measure  of  intensity 
of  motivation. 

In  form  A,  the  student  is  asked  to  rank  his  preferences  from  1  to  11 
for  the  types  of  duties  listed.  This  list  includes:  armament  officer, 
bombardier,  communications  officer,  engineering  officer,  gunner,  mechanic,
meteorologist,  navigator,  photographer,  pilot,  and  radio  operator. 

In  form  D  the  three  types  of  air-crew  duties  are  listed,  and  the  students 
are  instructed  to  write  1  opposite  their  first  choice,  2  opposite  their 
second  choice,  and  3  opposite  their  third.  They  are  required  to  mark  all
three,  and  not  to  give  any  two  the  same  rank.  It  was  with  this  form  that 
the  preference-waiver  section  was  introduced.  Following  are  the  state¬ 
ments  presented  for  the  student’s  use  : 

1.  I  would  prefer  to  be  classified  for  the  type  of  duty  for  which  I  am  found  to 
have  most  aptitude,  even  though  it  is  not  the  same  as  the  first  preference  given 
above 

2.  I  would  prefer  to  be  classified  for  the  type  of  duty  for  which  I  am  found  to 
have  most  aptitude,  only  if  my  aptitude  for  this  type  of  duty  is  at  least  two  points 
higher  on  the  9-point  aptitude  scale  than  for  the  duty  for  which  I  expressed  first 
preference 

3.  I  would  prefer  to  be  classified  for  the  type  of  duty  for  which  I  am  found  to 
have  most  aptitude  only  if  my  aptitude  for  my  first  preference  indicates  that  I  am 
likely  to  be  eliminated  from  that  type  of  training.  (A  score  of  3  or  below.) 

4.  I  would  prefer  to  be  assigned  to  the  type  of  training  for  which  I  indicated  first
preference  above  without  regard  to  my  aptitude  scores.

The  difficulties  and  limitations  of  the  ranking-preference  technique
and  of  the  wording  of  the  preference-waiver  section  were  discussed  on 
pages  723f. 

(2) Aircrew Preference Rating Scale, CE503A and B.*—The circumstance
that prompted the development of these scales was the dissatisfaction
with the ranking method used in Training Preference Blank,
CE501D.


*  Developed  at  Psychological  Research  Unit  No.  1.  Chief  contributor:  Maj.  Frederick  Wickert.


In  form  A,  students  are  asked  to  compare  their  preferences  for  each
type  of  air-crew  training  paired  with  every  other  type. 

The  items  for  form  B  were  taken  from  items  scaled  for  the  Navy.  The
examinee's  task  is  to  check  in  the  column  headed  "Pilot”  those  state¬ 
ments  that  describe  his  feelings  about  pilot  training.  The  procedure  then 
is  repeated  under  the  columns  for  "Navigator”  and  "Bombardier.” 

Form  A  was  not  given  to  a  sufficiently  large  sample  to  make  possible 
the  computation  of  any  statistics.  This  form  lacked  the  advantage  of 
giving  an  independent  intensity-of-preference  score  for  each  of  the  three
types  of  air-crew  duty.  The  principal  value  of  this  work  was  its  indication 
of  what  not  to  do  in  later  forms. 

Tabulations  of  the  data  for  form  B  indicated  that  practically  the  same 
items  were  checked  for  pilot,  navigator,  and  bombardier.  Differentiation 
of  interests  for  the  three  types  of  training,  therefore,  was  not  great 
enough  to  be  useful.  The  results  did  not  correspond  with  informal  im¬ 
pressions  of  the  degree  to  which  preferences  for  the  three  types  of  train¬ 
ing  were  differentiated  in  the  minds  of  the  students.  Since  the  score  ob¬ 
tained  for  each  of  the  three  types  of  training  appeared  to  be  so  meaning¬ 
less,  no  validation  was  attempted. 

(3)  Aviation  Cadet  Preference  Scale ,  CE509A. — This  is  a  graphic 
rating  scale,  and  it  is  the  last  variation  of  the  blank  designed  to  secure 
preferences  for  training.  It  consists  of  a  horizontal  line  divided  into  11 
spaces  by  dots  placed  at  equal  intervals  along  the  line.  One  end  of  the  line 
is  labeled  "Dislike  intensely,”  the  center  "Indifferent,”  and  the  other  end 
“Like  intensely.”  No  other  descriptive  comments  are  used.  The  student  is 
asked  to  draw  a  short  vertical  line  through  a  dot  and  write  B,  N,  and  P 
(for  bombardier,  navigator,  and  pilot)  on  the  line  that  represents  how  he 
feels  about  each  type  of  training. 

This  scale,  modified  in  accordance  with  suggestions  from  the  Office 
of  the  Air  Surgeon  and  from  Psychological  Research  Unit  No.  3,  re¬ 
sulted  in  a  new  form  called  the  Aviation  Cadet  Training  Preference 
Blank,  CE501E,  which  was  accepted  for  use  in  the  classification  battery 
and  was  described  above. 

Training  Preference  Blank,  CE513A 

This  blank  was  developed  in  order  to  measure  varying  degrees  of 
interest,  not  in  types  of  training,  but  in  types  of  airplanes,  combat  and 
noncombat.  It  is  comparable  to  the  blank  filled  out  in  basic  training  to 
aid  in  the  assignment  of  graduates  to  various  types  of  advanced  training. 
The  hypothesis  was  that  such  a  blank  would  be  valid  for  the  graduation- 
elimination  criterion  in  primary  and  other  phases  of  training. 

Description.—Nine-point scales like those appearing in the Aviation
Cadet Training Preference Blank, CE501E, are utilized in this blank
(see fig. 26.1).

(1) Internal characteristics.—The first section consists of seven scales,
listing various types of planes: trainers, transports, fighters (single and
twin-engine), and bombers (both medium and heavy). The examinee
is instructed to encircle the number which corresponds to his interest,
ranging from 1 (“little or no interest”) to 9 (“exceptionally strong
interest”).

In  section  two,  the  student  is  presented  with  a  list  of  the  seven  planes, 
which  he  is  to  rank  according  to  his  preferences  from  1  to  7. 

In  section  three,  another  graphic  scale  is  presented,  and  the  student 
is  asked  to  circle  the  number  which  represents  how  disappointed  he  would 
be  if  he  had  to  learn  to  fly  a  type  of  plane  not  among  his  first  three  choices.

(2)  Administration. — The  approximate  administration  time  for  the 
blank  is  10  minutes,  but  no  time  limit  is  set. 

Statistical  results. — (1)  Distribution  statistics.  Typical  examples  of 
distribution  statistics  obtained  on  this  blank  are  given  in  table  26.22. 


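Table 26.22.—Distribution constants for Training Preference Blank, CE513A, based
on 1,130 classified pilots entering primary training 1

Type of plane                    M       SD
Trainer ..................     4.55    2.22
Transport ................     4.85    1.97
Twin-engine fighter ......     6.97    2.01
Medium bomber (B-25) .....     6.27    1.74
Medium bomber (B-26) .....     5.91    1.88
Single-engine fighter ....     7.11    2.11
Four-engine bomber .......     6.27    2.14
Disappointment scale .....     4.38    1.86

1 In class 44-K. Tested at Psychological Research Unit No. 3.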


(2) Test validity.—This blank was administered for validation to
classified pilots just prior to entrance into primary training. Validation
data for strength of preference (on the 9-point scale for each of the
various types of planes) are presented in table 26.23.


Table 26.23.—Relation of strength of preference for different types of planes to
graduation-elimination of pilots from primary training 1 (Nt = 1,130, pg = 0.90)

Type of plane                   Mg      Me      SDt    rbis 2   rbis corrected 3
Trainer ..................    3.56    3.40    2.22    0.03     0.03
Transport ................    3.86    3.76    1.97     .02      .02
Twin-engine fighter ......    6.02    5.38    2.01     .15      .17
Medium bomber (B-25) .....    5.28    5.07    1.74     .06      .05
Medium bomber (B-26) .....    4.93    4.62    1.88     .08      .07
Single-engine fighter ....    6.15    5.59    2.11     .13      .11
Four-engine bomber .......    5.28    5.08    2.14     .05      .03
Disappointment scale .....    3.40    3.05    1.86     .09      .10

1 [Footnote illegible in the source.]
2 A biserial correlation of approximately 0.10 is required for significance at the 5 percent level
and of approximately 0.13 at the 1 percent level.
3 Assuming an unrestricted stanine standard deviation of 2.00.
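
The biserial coefficients of table 26.23 and the significance levels quoted in its footnote can be sketched as follows. The report does not print its formulas. This sketch assumes the standard biserial formula, r = (Mg - Me)/SDt x pq/y, where y is the ordinate of the unit normal curve at the point of dichotomy, and it assumes the large-sample standard error of a zero biserial coefficient, sqrt(pq)/(y sqrt(N)). The function names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

_UNIT_NORMAL = NormalDist()


def normal_ordinate(p):
    """Height of the unit normal curve at the point that cuts off proportion p."""
    return _UNIT_NORMAL.pdf(_UNIT_NORMAL.inv_cdf(p))


def biserial_r(mean_grad, mean_elim, sd_total, p_grad):
    """Biserial correlation from group means, total SD, and proportion graduating."""
    q = 1.0 - p_grad
    return (mean_grad - mean_elim) / sd_total * (p_grad * q) / normal_ordinate(p_grad)


def biserial_threshold(n, p_grad, z_crit):
    """Smallest absolute biserial r reaching the level implied by z_crit (assumed null SE)."""
    q = 1.0 - p_grad
    standard_error = sqrt(p_grad * q) / (normal_ordinate(p_grad) * sqrt(n))
    return z_crit * standard_error


# Twin-engine fighter row of table 26.23 (N = 1,130, p = 0.90).
print(round(biserial_r(6.02, 5.38, 2.01, 0.90), 2))
# Thresholds for the 5 percent and 1 percent levels.
print(round(biserial_threshold(1130, 0.90, 1.96), 2))
print(round(biserial_threshold(1130, 0.90, 2.576), 2))
```

With N = 1,130 and p = 0.90 the two thresholds round to 0.10 and 0.13, which agrees with the table footnote. Coefficients recomputed from the rounded means may differ from the printed ones in the second decimal place.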


Evaluation.—From the table of correlations, it can be seen that expressed
interest for only two types of planes, single-engine and twin-engine
fighters, shows a significant relation to graduation-elimination
from primary training. All coefficients, however, are positive. As for all
interest ratings, these correlations are attenuated by individual constant
errors, but the extent of attenuation is an open question.

Evaluation  of  Air-Crew  Preference  Blanks 

It was originally hoped that the strength-of-interest scale and the
preference waiver would indicate intensity of motivation. Interview
results, however, showed that many other factors enter into the student’s
decision as to which descriptive category to check and which statement
to mark. The following are typical:

(1) Many students seem to distort their ratings of strength of interest
in order to influence the classification board. An example would be a student
giving a very high preference for one position (such as “9” for
pilot), and placing the other two very low (“1” for bombardier and “2”
for navigator) on the scale.

(2)  Many  students,  perhaps  in  fear  of  being  grounded,  signify  their 
preferences  by  circling  “9”  for  all  categories. 

(3)  Students  tend  to  circle  numbers  that  have  words  under  them,  i.  e., 
the  odd  numbers  (see  fig.  26.1). 

(4)  Being  unable  to  understand  what  the  statement  means,  the  student 
may  mark  a  statement  because  he  is  influenced  by  irrelevant  factors, 
such  as  the  feeling-tone  imparted  by  the  wording  of  the  statement. 

(5)  Marking  statements  in  the  preference-waiver  section  often  indi¬ 
cates  whether  the  student  has  faith  in  the  tests  or  not,  thereby  measur¬ 
ing  his  attitude  toward  psychological  tests  as  well  as  (or  rather  than)  his 
motivation. 

(6)  Many  students  seem  to  mark  the  waivers  “1”  or  “2”  because  of  a 
sense  of  duty  or  patriotism  in  doing  what  the  Army  wants  them  to  do. 

(7)  Many  students  mark  waivers  “1”  or  "2”  because  of  a  feeling  that 
the  classifying  officer  will  be  prejudiced  against  them  if  they  mark  state¬ 
ments  “3”  or  “4." 

(8)  Some  students  seem  to  feel  that  the  four  waiver  statements  con¬ 
stitute  a  disguised  personality  test  of  some  sort.  The  students  feel  that 
they  will  be  considered  indecisive  or  lacking  in  a  knowledge  of  their  own 
desires  if  they  mark  statement  "1,”  and  that  they  may  be  considered 
stubborn  if  they  mark  statement  “4.” 

There  are  some  grounds  for  the  belief  that  the  preference  waiver 
should  be  given  at  the  end  of  the  testing  sessions  rather  than  at  the 
beginning.  After  the  students  have  had  some  experience  with  the  tests, 
they  feel  better  qualified  to  state  whether  they  desire  to  be  classified
according  to  the  test  results. 

There  are  a  few  general  conclusions  and  group  tendencies  that  can  be 
noted  in  an  examination  of  the  statistical  data.  In  brief,  they  are:

(1) Those students who prefer pilot training are most likely to succeed
in graduating from primary training (but this does not apply to
basic training).

(2) Those whose first choice is bombardier have the lowest average
stanines for all three air-crew positions.

(3) Those whose first choice is navigation training have the highest
average stanines for all three air-crew assignments. Conversely, those
who put navigator last in their order of preferences have the lowest
average stanines.

(4)  The  validity  of  the  pilot  stanine  is  not  higher  for  those  whose  first 
choice  is  pilot  training. 

(5) Strength of interest for navigation training has significant positive
correlations with classification tests that have high validity with the
navigator criterion, while strength of interest for either pilot or bombardier
training does not correlate well with any tests.

Because  navigator  interest  is  significantly  related  to  the  navigator 
stanine,  success  in  navigator  training,  and  the  classification  tests,  there 
is  reason  to  believe  that  students  who  prefer  navigator  training  have 
superior  insight  into  their  abilities  and  temperament. 

The  strong  emphasis  placed  upon  the  student’s  preferences  in  making 
recommendations  for  air-crew  training  is  not  warranted  by  the  empirical 
evidence.  This  is  not  to  say  that  the  information  yielded  by  the  preference 
blank  was  not  valuable  to  those  recommending  air-crew  classifications. 
The  conclusion  is  undeniable,  however,  that  self-assessed  preferences  have 
very  low  validity  for  predicting  success  in  air-crew  training.

ATTITUDE  AND  INTEREST  INVENTORIES 

Various  types  of  instruments — personal  inventories,  situation  tests,
check  lists,  and  the  like — were  used  to  obtain  expressions  of  specific 
likes  and  aversions.  This  constitutes  an  indirect  and  more  objective 
method,  as  contrasted  with  the  direct  and  less  objective  method  of  general 
self-assessment.  It  was  assumed  that  these  specific  preferences  would 
form  a  pattern  and  serve  as  a  basis  of  trainee  selection. 

Satisfaction  Test,  CE409A  *

This test was developed on the basis of certain hypotheses concerning
interest-trends in the various air-crew specialties. The successful combat
pilot was assumed to be extroverted; the navigator was assumed to
be characterized by sedentary and scientific interests; and the bombardier
was assumed to be characterized by aggressive-destructive tendencies.

Description.—A purely empirical approach to the measurement of
air-crew personality was adopted in developing this inventory. A collection
was made of verbally described personality-revealing situations by
studying “Information Essays” written by students in connection with
the development of Technical Vocabulary and Information Test
(CE505A)* and by studying job analyses containing personality data.

* Developed at Psychological Research Unit No. 1. Chief contributors: Tech./Sgt. Robert R.
Blake, Capt. Donald E. Super, Staff/Sgt. John L. Wallen.

* See chapter 14 for a discussion of this test.

Preliminary research on expressions of likes and dislikes of aviation
students involved the preparation of two questionnaires. In Questionnaire
I, the examinees were given descriptions of a number of situations in
which soldiers frequently find themselves. The examinees were asked
to write briefly what they would do in each case. In Questionnaire II
the examinees were asked to write down those features of military life
(different from civilian life) which they liked and those which they disliked.
A statement of the number of months the examinee had been in
active service was also obtained. In order to secure complete frankness
on both questionnaires, the examinees were told that this information
would have no bearing upon their classification, and they were told not to
sign their names on their papers.

The data obtained from the second questionnaire furnished material
for part I (Aviation Cadet Likes) and part II (Aviation Cadet Dislikes)
of the Satisfaction Test, CE409A. It was found that “Likes” result from
personal care and privileges (food, regular schedules, uniforms, equipment,
etc.); social relations (comradeship, discipline, social uniformity,
etc.); and personal values (educational opportunities, chance for advancement,
freedom from worry about present or future, exactness,
precision, promptness, etc.). “Dislikes” result from physical inconveniences
(food and method of serving, lack of lights, mud, crowded or
inadequate toilets and washing facilities, etc.); duties or lack of privileges
(getting up early, latrine duty, K. P., guard duty, not enough free time,
etc.); personal frustrations (loss of individuality, harsh authority, being
away from home and friends, etc.); inefficiency (organization, waiting in
line, lack of knowledge about the future, slowness to get action on matters
of personal importance, etc.); and social relations (other cadets considered
undesirable, lack of feminine companionship, etc.).*

A  few  examples  of  the  types  of  items  in  parts  I  and  II  are:


As  a  cadet,  I  would  get  more  satisfaction  from: 

(A)  Everybody  being  on  equal  terms,  or 

(B)  Getting  regular  medical  care. 

(A)  Security  and  freedom  from  worry  about  the  present,  or 
(B)  The  chance  for  a  career.

(A)  The  chance  for  a  career,  or 

(B)  Good  pay. 

As  a  cadet,  I  would  be  more  irritated  by : 

(A)  Unfair  or  harsh  orders,  or 

(B)  Being  away  from  home  and  loved  ones.

(A)  Too  much  regimentation,  or 

(B)  Lack  of  recreational  facilities. 

(A)  Lack  of  personal  privacy,  or 

(B)  Lack  of  feminine  companionship. 


* It is interesting to note the differences between new men and those [illegible] the service.
It appeared that (1) individual differences in [illegible] are largely eliminated with increased
military experience, so that all men gradually [illegible] dislike very [illegible] the same things;
(2) personal frustrations play a more important part as length of service increases; and (3) the
general area of social relationships becomes of increasingly greater importance with time
[illegible] as the source both of obstacles and of aids to personal adjustment [illegible].

The data obtained from Questionnaire I proved very useful in constructing
items for part III, “[illegible] Opinions,” of the satisfaction test.
This part is composed of a number of described situations, each followed
by descriptions of five alternative responses. The five alternatives were
chosen in accordance with six assumed modes of reacting to a problem
situation: rationalization, reaction-formation, compensation, introversion
(submission), extraversion (ascendance), and neuroticism.

It was decided that there would be no alternative referring to an integrated
manner of behaving, since it seemed probable that it would appear
to the examinees to be the obviously right answer. It was expected that the
examinee would either (1) consistently choose alternatives of a given
kind, revealing behavior which is characteristic of himself, or (2)
choose alternatives revealing various modes of reaction so that his score
for any one mode of reaction would be moderate. The latter result was
thought to identify the integrated person. A typical part III item is:

If  a  fellow  is  not  genuinely  admired  by  the  men  in  his  squadron,  he  will  most
probably : 

A.  Feel  that  the  other  men  are  not  the  kind  that  he  cares  to  be  admired  by 

and  make  no  effort  to  gain  their  friendship. 

B.  Try  to  be  the  best  soldier  in  the  bunch  in  order  to  show  them  up. 

C.  Say  that  someone  who  is  jealous  of  him  is  passing  rumors  about  him. 

D.  Feel  glad  because  he  gets  more  satisfaction  from  being  on  his  own  than 

from  being  one  of  the  gang. 

E.  Feel  hurt  by  their  coolness  and  try  to  boss  them  around. 

In  the  item  above,  A  would  be  the  rationalization  response,  B  the 
compensation,  C  the  neurotic,  D  the  submissive-introverted  reaction,  and 
E  the  ascendant-extroverted  reaction.

Part IV, Preferences, contains paired comparisons of civilian activities
(sports and hobbies) and flight assignments assumed to be related to
personality traits. Following are examples of items appearing in part IV.

If  given  the  choice  and  having  equal  opportunity  and  ability,  would  you  rather: 

(A)  Go  to  a  carnival?  or

(B)  Go  to  an  opera?

(A)  Play  cards?  or

(B)  Play  football?

(A)  Be  a  sergeant  on  active  flying  duty?  or

(B)  Be  a  lieutenant  with  a  desk  job?

(1) Internal characteristics.—In final form the test consists of 150
items divided among the four parts just described. Part I contains 30
items; part II, 30; part III, 45; and part IV, 45.

(2) Administration.—Before each part, a brief introductory statement
is made. For parts I, II, and IV the examinee is told to judge
each pair independently of the other pairs, since any alternative may
occur several times, paired with a different alternative each time. The
examinee is instructed that if little preference is felt for either alternative,
he is to choose the less objectionable activity.

For part III, the examinee is instructed to choose the alternative which
most closely corresponds to his own opinion about each particular situation
or activity. If none of the statements exactly expresses the opinion
of the examinee, he is instructed to choose the one that is the closest to
his opinion. Even if several choices could express the opinion of the
examinee, he is instructed to select only the best one.

The  total  testing  time  for  part  I  is  10  minutes;  for  part  II,  10  minutes; 
for  part  III,  16  minutes;  and  for  part  IV,  10  minutes. 

The  test  is  paced,  with  the  administrator  interrupting  the  testing  to 
ask  for  a  show  of  hands  of  those  who  had  finished  part  I  at  the  end  of  7 
minutes,  part  II  at  the  end  of  7  minutes,  part  III  at  the  end  of  12  minutes, 
and  part  IV  at  the  end  of  7  minutes. 

(3)  Scoring. — The  papers  were  first  scored  on  two  a  priori  keys  pre¬ 
pared  by  the  test  constructors,  one  designed  to  measure  atypicality  of 
attitude  and  one  designed  to  measure  morale.  Since  these  keys  had  no 
validity,  a  priori  keys  were  abandoned  in  favor  of  empirically  derived
keys. 

Following two item analyses based on random halves of a total group
of 1,595 cases, using as the criterion graduation-elimination from primary
pilot training, two empirical scoring keys were prepared for cross-validation.
All responses that were made by significantly different percentages
of graduates and eliminees (significant at the 10 percent level
or better) were examined to discover whether the items could be defended
psychologically as well as statistically. On these bases, two scoring
keys were constructed.

The  scoring  formula  is  R - W + 40,  in  which  R  refers  to  the  positively
weighted  responses  and  W  to  the  negatively  weighted.
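
The construction of an empirical key and the R - W + 40 score can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure actually used. The report does not say which significance test was applied, so a two-proportion z test with a two-sided 10 percent cutoff (z of 1.645 or more) is assumed. Unit weights of +1 and -1 are also an assumption. The psychological review of each response is not modeled, and the function names and the small data set are hypothetical.

```python
from math import sqrt


def two_proportion_z(hits_a, n_a, hits_b, n_b):
    """z statistic for the difference between two independent proportions."""
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    if se == 0.0:
        return 0.0
    return (hits_a / n_a - hits_b / n_b) / se


def build_key(answers, graduated, z_crit=1.645):
    """Weight +1 responses favored by graduates, -1 those favored by eliminees.

    answers is a list of sets of chosen responses, one set per examinee;
    graduated is a parallel list of booleans. Both groups must be non-empty.
    """
    grads = [a for a, g in zip(answers, graduated) if g]
    elims = [a for a, g in zip(answers, graduated) if not g]
    key = {}
    for response in sorted(set().union(*answers)):
        z = two_proportion_z(
            sum(response in a for a in grads), len(grads),
            sum(response in a for a in elims), len(elims),
        )
        if abs(z) >= z_crit:
            key[response] = 1 if z > 0 else -1
    return key


def score(chosen, key, constant=40):
    """R - W + constant, counting only responses that appear in the key."""
    right = sum(1 for r in chosen if key.get(r) == 1)
    wrong = sum(1 for r in chosen if key.get(r) == -1)
    return right - wrong + constant


# Hypothetical development half: response 1A separates the groups, item 2 does not.
development = ([{"1A", "2A"}] * 5 + [{"1A", "2B"}] * 5
               + [{"1B", "2A"}] * 5 + [{"1B", "2B"}] * 5)
outcome = [True] * 10 + [False] * 10
key = build_key(development, outcome)
print(key)
print(score({"1A", "2B"}, key))
```

For cross-validation, a key built on one random half would be used to score the papers of the other half, and the resulting scores would then be correlated with graduation-elimination.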

Statistical results.—The data that follow are for examinees tested in
October 1942 at Psychological Research Unit No. 1.

(1) Distribution statistics.—Typical examples of distribution statistics
obtained on this test are given in table 26.24.

Table 26.24.—Distribution constants for Satisfaction Test, CE409A, for pilots in
primary training

N      M      SD

[Table body largely illegible in the source; the legible entries are the means 40.0 and 46.4.]


(2) Test validity.—Based on the two a priori keys, one to measure
atypicality of attitude and one to measure morale, a sample of 787 classified
pilots yielded biserial correlations with graduation-elimination in
primary pilot training of -0.12 and -0.01 respectively. Cross-validation
data for the two empirically developed keys, however, are much more
satisfactory. The data appear in table 26.25.

Table 26.25.—Cross-validation data for Satisfaction Test, CE409A, based on pilots
in primary training, using graduation-elimination as the criterion

Group        Key     Nt      pg      Mg      Me     SDt    rbis
Odds ....    ....   ....    0.58    40.8    39.0    6.3    ....
Evens ...    ....   ....     .58    47.3    45.4    5.7    ....

[Entries shown as .... are illegible in the source.]


Evaluation. — The  a  priori  keys  showed  no  validity  for  primary  pilot 
training.  The  empirical  keys,  however,  developed  on  the  basis  of  item- 
validity  coefficients,  yielded  validities  of  approximately  0.2  upon  cross- 
validation.  The  validity  remained  at  about  0.2  in  subsequent  administra¬ 
tions  of  two  revisions  of  the  test  (see  below).  This  validity  and  the  low 
correlations  with  the  pilot  stanine  indicate  that  the  test  would  be  expected 
to  add  moderately  to  the  validity  of  the  classification  battery.

The item-analysis data indicated that the rationale back of the construction
of this test, as far as the pilot is concerned, was promising and
that there was a need for further item revision and item writing. On the
strength of the results, the test was revised and two new forms prepared.

Satisfaction  Test,  CE409B  and  C  7

These  two  forms  are  revisions  and  extensions  of  CE409A. 

Description.—The items for Forms B and C were selected from parts
III and IV of Form A of the test on the basis of the item validities revealed
in the previously discussed analyses. The criterion for selection
was a tetrachoric correlation of 0.08 or more with success in primary
pilot training, which is at the 10 percent level of significance or better.
In addition to these, new items were constructed according to the
principles which seemed to underlie the types of items already demonstrated
to be valid.

(1)  Internal  characteristics. — Forms  B  and  C  consist  of  85  paired- 
comparison  items,  25  in  part  I  and  60  in  part  II.  In  Form  B,  the  examinee 
is  required  to  make  a  choice  between  the  two  alternatives.  Form  C,  however,
which  contains  the  same  items  as  Form  B,  permits  a  third  response
— “Neither  of  these.”  Items  in  part  I  are  prefixed  with  different  premises, 
such  as  the  two  items  below.  Asterisks  indicate  the  alternatives  with  posi¬ 
tive  pilot  validity. 

Other  things  being  equal,  actual  parachute  jumping  would  be: 

•A.  A  thrilling  experience. 

B.  Good  because  it  is  so  necessary  for  fliers. 

C  Neither  of  these.* 

As  a  cadet,  I  would  expect  to  get  more  satisfaction  because : 

*A.  I  would  like  to  fly. 

B.  Fliers  are  badly  needed. 

C  Neithc  of  these.* 

⁹ Developed at Psychological Research Unit No. 1. Chief contributors: Tech./Sgt. Egbert
R. Elate, Capt. Donald E. Super, Staff/Sgt. John L. Wallen.

* These alternatives appear only in Form C.

All items in part II are prefixed with the same premise, such as:

If  given  the  choice  and  having  equal  opportunity  and  ability,  would  you  rather: 

A.  Do  high  level  precision  bombing? 

*B.  Do  dive  bombing? 

C.  Neither  of  these.* 

*A. Practice aerobatics?

B. Practice instrument flying?

C. Neither of these.*

(2) Administration. — The total testing and administration time for
Form B or C is 24 minutes. Parts I and II are not timed separately.

(3) Scoring. — The scoring formula for Forms B and C is R − W + 20,
in which R refers to positively weighted responses and W to negatively
weighted responses, as determined by item analyses. The constant is
added to eliminate negative scores.
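
The scoring rule can be illustrated with a minimal sketch. It assumes the key is stored as a mapping from (item number, alternative) to a weight of +1 or −1; the key contents and all names below are hypothetical and are not the actual CE409 key.

```python
def score_answer_sheet(responses, key, constant=20):
    """Return R - W + constant for one examinee.

    responses: dict mapping item number -> chosen alternative ("A", "B", "C")
    key:       dict mapping (item number, alternative) -> +1 or -1;
               unkeyed alternatives are simply absent (weight 0)
    """
    rights = sum(1 for item, alt in responses.items() if key.get((item, alt), 0) > 0)
    wrongs = sum(1 for item, alt in responses.items() if key.get((item, alt), 0) < 0)
    return rights - wrongs + constant


# Hypothetical four-item key, for illustration only.
hypothetical_key = {(1, "A"): +1, (1, "B"): -1, (2, "B"): +1, (3, "A"): -1}
responses = {1: "A", 2: "B", 3: "A", 4: "B"}

# Two positively weighted choices, one negatively weighted, one unkeyed.
print(score_answer_sheet(responses, hypothetical_key))  # 21
```

With a constant of 20, an examinee would need more than 20 negatively weighted responses in excess of positively weighted ones before the score became negative.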

Statistical  results.  (1)  Distribution  statistics. — Typical  examples  of 
distribution  statistics  obtained  for  Satisfaction  Test,  CE409B  and  C, 
are  given  in  table  26.26. 


Table 26.26. — Distribution constants for Satisfaction Test, CE409B and C, based
on pilots in primary training 1

Form      N       M      SD
B        739    12.0    4.6
B        740     7.2    4.2
C        566     6.2    6.0
C        529     9.6    6.3

1 ... and September 1943.


(2) Reliability coefficient. — The internal consistency of the test is
indicated by the Kuder-Richardson (Formula No. 21) coefficients in
table 26.27.
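
The text does not reproduce Formula No. 21. As it is commonly stated, the estimate uses only the number of items k, the mean M, and the variance of number-right scores, and it assumes items scored 0 or 1 and of roughly equal difficulty. The sketch below follows that common statement; the input values are hypothetical and are not taken from table 26.26, whose scores are on the R − W + 20 scale.

```python
def kr21(k, mean, sd):
    """Kuder-Richardson Formula 21 reliability estimate.

    k:    number of items, each scored 0 or 1
    mean: mean number-right score
    sd:   standard deviation of number-right scores
    """
    if k < 2 or sd <= 0:
        raise ValueError("need at least two items and a positive SD")
    return (k / (k - 1)) * (1 - mean * (k - mean) / (k * sd ** 2))


# Hypothetical values for an 85-item test, for illustration only.
estimate = kr21(k=85, mean=50.0, sd=8.0)
print(round(estimate, 2))
```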


Table 26.27. — Estimated reliability coefficients (Kuder-Richardson Formula No. 21)
for Satisfaction Test, CE409B and C, based on pilots in primary training 1

B
B
C
C

1 Same samples as in table 26.26.


(3) Item validity. — After dividing the sample of 1,479 answer sheets
(the first two samples of table 26.26) into two random halves, the
responses to the items in each sample of Form B were correlated with the
graduation-elimination criterion from primary pilot training in order
to develop an empirical key. The same procedure was followed with the
1,095 answer sheets (the last two samples of table 26.26) of Form C.
The distributions of the tetrachoric r's are shown in table 26.28. Items


are not included in this distribution when one response was chosen by
more than 90 percent or less than 10 percent of the examinees.


Table 26.28. — Tetrachoric r distribution for items in Satisfaction Test, CE409B and
C, based on odds and evens samples of pilots in primary training, using
graduation-elimination as the criterion

                            Form B 1              Form C
Tetrachoric r            Evens 2   Odds 3    Evens 4   Odds 5
0.28-0.32                                       1         2
0.23-0.27                                       1         1
0.18-0.22                   2         4         4         6
0.13-0.17                   6         7        14        14
0.08-0.12                  20        20        19        17
0.03-0.07                  27        23        33        34
-0.02-0.02                 17        26        31        46
(-0.07)-(-0.03)            27        23        37        23
(-0.12)-(-0.08)            20        20        18        20
(-0.17)-(-0.13)             6         7        11        11
(-0.22)-(-0.18)             2         4        10         6
(-0.27)-(-0.23)                                 4         2
Total                     127       134       183       182

1 Note that this is the 2-choice form of the test.
2 N_t = 740, p_g = 0.77.
3 N_t = 739, p_g = 0.79.
4 N_t = 529, p_g = 0.80.
5 N_t = 566, p_g = 0.79.

In interpreting these tetrachoric r's, it can be said that for the N of
approximately 750 an r_tet of 0.07 is significant at the 5 percent level, and
an r_tet of 0.09 is significant at the 1 percent level of confidence. For the
N of approximately 550, an r_tet of 0.09 is significant at the 5 percent
level, and an r_tet of 0.11 is significant at the 1 percent level of confidence.
In the odds sample of Form B, 31 questions yield r_tet's which exceed
the 5 percent level of significance and 23 questions yield r_tet's which exceed
the 1 percent level. In the evens sample of Form B, the corresponding figures
are 33 and 24. In the odds sample of Form C, ?7 r_tet's exceed the 5 percent
level of significance and 53 exceed the 1 percent level. In the evens
sample of Form C, the corresponding figures are 72 and 58.

(4) Cross-validation data. — Cross-validation results for Form B, using the
keys based on these item validations, are given in table 26.29. The
same procedure was followed for Form C; those data appear in table
26.30.


Table 26.29. — Cross-validation data for Satisfaction Test, CE409B, for samples of
pilots in primary training, based on graduation-elimination criterion

             Key based     Scoring                                                     r_bis
Sample       on sample     formula     N_t      p_g     M_g     M_e    SD_t   r_bis   (corrected) 1
I (odds)     II (evens)    R-W+20      739 2    0.79   12.69   11.37   4.60    0.18    ...
II (evens)   I (odds)      R-W+20      740 2     .77    7.85    6.42   4.24     .20    ...
III          I and II      Rights    1,475 3     .79   19.01   17.48   3.81     .23    ...
III          I and II      Wrongs    1,475 3     .79    8.10    9.4?   3.75    -.20    ...

1 Corrected ...
2 Same samples as in table 26.26.
3 New sample tested at Psychological Research Unit No. 1 in August and September 1943.


Table 26.30. — Cross-validation data for Satisfaction Test, CE409C, for samples
of pilots in primary training, based on graduation-elimination criterion

             Key based     Scoring                                                     r_bis
Sample       on sample     formula     N_t      p_g     M_g     M_e    SD_t   r_bis   (corrected)
I (odds)     II (evens)    R-W+20      566 1    0.79    7.03    5.17   6.03    0.18    0.19
II (evens)   I (odds)      R-W+20      529 1     .80   10.48    8.60   6.32     .17     .18
III          I and II      Rights    1,399 2     .80   18.47   16.72   4.26     .23     .25
III          I and II      Wrongs    1,399 2     .80    8.84   10.69   3.98    -.26    -.28

1 Same samples as in table 26.26.
2 New sample tested at Psychological Research Unit No. 1 in August and September 1943.


The scoring key used for sample III is the combined key developed
from the item analysis of the odds and evens samples. Because Forms B
and C are for all intents and purposes the same test, it was possible
to utilize the four existing item-validation studies in making a final key.
This key consists of 38 positively weighted alternatives and 36 negatively
weighted alternatives. The scoring formula is R − W + 20, in which R
refers to positively weighted responses and W to negatively weighted
responses.

(5)  Use  of  satisfaction  test  for  pilot  specialisation. — An  item  analysis 
was  completed  for  this  test,  using  as  the  criterion  the  ratings  given  to 
pilots  on  general  pilot  ability.*  Forty-two  bomber  pilots  who  were  above 
average  in  general  pilot  ability  in  transition  schools  and  97  fighter  pilots 
who  were  above  average  in  single-engine  advanced  schools  were  used. 
In  interpreting  the  phi  coefficients,  it  can  be  said  that  for  an  N  of  150, 
a  phi  of  0.16  is  significant  at  approximately  the  5  percent  level,  and  a 
phi  of  0.21  is  significant  at  the  1  percent  level  of  confidence.  In  this 
sample,  11  items  exceed  the  5  percent  level  of  significance  and  7  exceed 
the  1  percent  level.  This  tends  to  show  that  if  items  that  differentiate 
between  bomber  and  fighter  pilots  were  keyed,  a  prediction  could  be  made 
that  would  help  to  place  pilots  properly  in  their  specialties.  The  dis¬ 
tribution  of  the  phi  coefficients  is  shown  in  table  26.31. 
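
The text does not say how the significance thresholds for phi were obtained. One conventional route, offered here only as an assumption, uses the relation chi-square = N × phi² with one degree of freedom, so that the smallest significant phi is the square root of the critical chi-square divided by N. The sketch below uses the standard tabled critical values of 3.841 and 6.635.

```python
import math

# Standard critical values of chi-square with 1 degree of freedom.
CHI2_CRITICAL_1DF = {0.05: 3.841, 0.01: 6.635}


def critical_phi(n, alpha):
    """Smallest |phi| significant at level alpha for a 2x2 table of n cases."""
    return math.sqrt(CHI2_CRITICAL_1DF[alpha] / n)


# For an N of 150 this reproduces the thresholds quoted in the text.
print(round(critical_phi(150, 0.05), 2))  # 0.16
print(round(critical_phi(150, 0.01), 2))  # 0.21
```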


Table 26.31. — Distribution of phi coefficients for Satisfaction Test, CE409B, based
on a sample of 148 pilots in transition and advanced training 1

Phi 2         Frequency      Phi 2         Frequency
0.30-0.34         1          0.10-0.14        14
0.25-0.29         1          0.05-0.09        17
0.20-0.24         5          0.00-0.04        23
0.15-0.19         7          Total            70

1 In class 44B. Tested while in basic training by personnel of Psychological Research Unit ...
2 Since the test is a two-choice form, only positive phis are tallied.


Tables  26.32  and  26.33  present  means,  standard  deviations,  and  critical 
ratios  of  the  differences  in  the  mean  scores  on  the  Satisfaction  test  for 
high  and  low  bomber  pilots  and  high  and  low  fighter  pilots.  For  the 
purpose  of  obtaining  the  high  and  low  groups,  the  Pilot  Proficiency 


Cards were consulted. The ratings bomber pilots received in general
pilot ability in advanced and transitional training were summed, and
two groups were formed: those at or above the mean and those below
the mean. The same was done for fighter pilot ratings in basic and
advanced training. The four groups then were scored with the final empirical
key discussed above.

* Ratings of "above average," "average," and "below average" were given by pilot
instructors and flight commanders on the student's pilot proficiency cards at each phase
of training.

Table 26.32. — Means and standard deviations of high and low bomber and fighter
pilots 1 on Satisfaction Test, CE409B

Group            Score        N       M      SD
High bomber      Rights      213    19.9    5.2
                 Wrongs      213    15.8    4.7
                 R-W+20      213    24.1    ...
Low bomber       Rights      180    22.3    5.7
                 Wrongs      180    13.6    4.7
                 R-W+20      180    28.7    9.4
High fighter     Rights      144    25.1    5.2
                 Wrongs      144    10.6    3.5
                 R-W+20      144    34.4    7.3
Low fighter      Rights      164    25.2    4.9
                 Wrongs      164    10.9    3.7
                 R-W+20      164    34.3    7.3

1 In class 44B. Tested in basic training by personnel of Psychological Research Unit No. 3.

Table 26.33. — Critical ratios of differences of means of "high" and "low"
bomber and fighter pilots on Satisfaction Test, CE409B

Group                                 Score        CR
"High" bomber v. "low" bomber         Rights      *4.4
                                      Wrongs      *4.7
                                      R-W+20      *5.0
"High" fighter v. "low" fighter       Rights        .0
                                      Wrongs        .7
                                      R-W+20        .1
"High" bomber v. "high" fighter       Rights      *4.3
                                      Wrongs     *12.0
                                      R-W+20     *12.1
"Low" bomber v. "low" fighter         Rights      *5.0
                                      Wrongs      *5.9
                                      R-W+20      *6.?

*  Significant  at  the  1  percent  level. 
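
The report does not state the formula used for the critical ratios. A minimal sketch, assuming the conventional large-sample form (difference of means divided by the standard error of the difference for independent groups), is given below. The input values are hypothetical and are chosen only to make the arithmetic easy to follow.

```python
import math


def critical_ratio(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Difference between two independent group means in standard-error units."""
    standard_error = math.sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return (mean_1 - mean_2) / standard_error


# Hypothetical groups: means 25 and 22, both with SD 5 and N 200.
cr = critical_ratio(25.0, 5.0, 200, 22.0, 5.0, 200)
print(cr)  # 6.0

# Under a normal approximation, an absolute value above about 2.58
# corresponds to significance at the 1 percent level (two-tailed).
print(abs(cr) > 2.58)  # True
```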

Evaluation. — Forms B and C of the Satisfaction Test proved to be very
useful testing instruments. Compared with Form A, they are shorter,
take half the time, and are easier to administer. On three different
samples, the test maintained a moderately high validity coefficient
with the pilot criterion. There is also reason to believe that this inventory
measures some factor not adequately measured by the classification
battery. Owing to its low correlation with the stanine and its moderate
validity for the pilot criterion, this test would be a valuable addition to the
pilot-selection battery.

It is interesting to note the nature of the items that show consistent
validity for pilot selection. Examination of the keyed items suggests
definite clustering around the picture of the eager fighter pilot, who likes
to fly for the sake of flying, for the excitement, and for personal
adventure. This interest and attitude pattern appears to be related to success
in training, at least at the primary level.

Form B seems promising as a vehicle to determine in which specialty
(bomber v. fighter) a pilot should go. As in the Aviation Preference
Check List, the statistical analysis indicates that there are personality
traits which differentiate successful bomber pilots from successful fighter
pilots.

There are 12 responses (one-third of all keyed items) in Form B that
are weighted positively for the primary pilot-validity key, positively for
the fighter-pilot key, and negatively for the bomber-pilot key. There are
five responses which are positively weighted both for students in primary
pilot training and for bomber pilots in transitional training. The other
19 items weighted for primary pilot are weighted zero for
pilot-specialization criteria. These data tend to show that primary-training motivation
is more akin to fighter-pilot motivation than to bomber-pilot motivation,
and this is corroborated by the data in tables 26.32 and 26.33.

As has previously been explained, Form C differs from Form B only in
that it has a third alternative, "Neither of these." On the basis of the
Kuder-Richardson formula, Form C appears to provide a somewhat more
reliable test (0.8 as against 0.7 for Form B). Because this slightly higher
reliability is not accompanied by a correspondingly higher validity, it is
felt that the evidence does not warrant using the third response in
constructing a new form.

Variation of the test. (1) Satisfaction Test, CE409D.¹⁰ — Form D is
the culmination of all the work done on the Satisfaction Test and the
Aviation Preference Check List. Because the two tests were quite similar, it
was possible to choose the most valid items from a combined total of 235
items. The 60 items of highest validity were selected. The correlation of
these items with the pass-fail criterion in primary pilot training is
indicated by an absolute mean phi of 0.11 and a range from 0.07 to 0.19.
Twenty additional items were added to these 60 as padding to confuse
the examinee as to the purpose of the test and so to keep him from
rationalizing which are the right answers.

Form  D  is  divided  into  two  parts  of  40  items  each.  The  items  of  the 
two  parts  are  matched  for  validity,  content,  and  percentage  answering  the 
item. 

The  total  testing  and  administration  time  for  form  D  is  16  minutes, 
with  parts  I  and  II  timed  separately. 

For the Form D key, the 60 alternatives with previous positive validity
were given weights of +1, the 60 alternatives with previous negative
validity were given weights of −1, and 0 weights were given to the 20
padding items. The scoring formula is R − W + 20, the constant being
added to eliminate negative scores.

Form D is shorter and takes less administration and testing time than
the previous forms. Because of the matched parts, it is expected to give
a better estimate of reliability. Correspondingly higher validity is also
expected because of the inclusion only of items with high validity
coefficients.

¹⁰ Developed at ...

No statistical data are available.

Aviation  Preference  Check  List  (no  code) 

This  check  list  is  a  revision  of  the  Navy  Aviation  Preference  Check 
List.  It  was  hoped  that  differences  would  be  found  in  the  likes  and  dis¬ 
likes  of  successful  and  unsuccessful  pilots.  If  such  an  instrument  proved 
to  be  successful  for  selecting  pilots,  similar  instruments  might  also 
assess  the  motivation  for  bombardier  and  navigator  training. 

Description. — The  preference  list  is  composed  of  descriptions  of  ac¬ 
tivities  in  the  fields  of  sports,  hobbies,  and  military  flight  assignments. 

The  activities  are  paired,  and  the  examinee  is  required  to  choose  the 
activity  he  prefers.  The  two  alternatives  represent  different  types  of 
participation  in  the  same  field  of  activity.  The  alternatives  are  worded 
to  eliminate  bias  as  much  as  possible,  since  analysis  has  shown  that  certain 
adjectives  or  adverbs  tend  to  influence  responses  to  items. 

(1) Internal characteristics. — In the final form of the test there are
150 items. A typical item is: "(A) Fly with others in the ship, (B) Fly
solo."

(2) Administration. — The examinee is instructed to answer items
quickly according to his first impression, avoiding deliberation. He is
required to answer all questions. No time limit is set, but 20 minutes are
sufficient time for almost everyone to finish. The test is paced by the
administrator. After 6 minutes, the examinees are told that they should
be on item 50. After 12 minutes, they are told that they should be on
item 100.

(3) Scoring. — The scoring formula is R − W + 40, in which R represents
responses with positive weights and W responses with negative
weights.

Statistical results. — (1) Item validity. After dividing a sample of
1,459 answer sheets completed by examinees at Psychological Research
Unit No. 3 into two random halves, the responses to the items in the
sample were correlated with a modified graduation-elimination criterion,
with students rated "below average" in any phase of training* placed
with the eliminees. Graduates had passed at all levels of training, and
they include both bomber and fighter pilots. The distributions of the phi
coefficients are shown in table 26.34. Items are not included if one
response is chosen by more than 90 percent or less than 10 percent of the
examinees.
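
As a rough illustration of the item analysis just described, the sketch below computes a phi coefficient from the fourfold table of response (alternative A or B) against criterion (graduated or eliminated), and applies the 90/10 percent exclusion rule. The standard formula for phi is used; the function names and the counts are hypothetical.

```python
import math


def phi_coefficient(a_grad, a_elim, b_grad, b_elim):
    """Phi for a 2x2 table: alternative chosen (A/B) by outcome (graduated/eliminated)."""
    numerator = a_grad * b_elim - a_elim * b_grad
    denominator = math.sqrt(
        (a_grad + a_elim) * (b_grad + b_elim) * (a_grad + b_grad) * (a_elim + b_elim)
    )
    return numerator / denominator if denominator else 0.0


def item_is_tallied(a_grad, a_elim, b_grad, b_elim, low=0.10, high=0.90):
    """False when one response is chosen by more than 90 or less than 10 percent."""
    total = a_grad + a_elim + b_grad + b_elim
    proportion_a = (a_grad + a_elim) / total
    return low <= proportion_a <= high


# Hypothetical counts for one item in a half-sample of 730 examinees.
counts = (300, 60, 260, 110)
print(item_is_tallied(*counts))            # True
print(round(phi_coefficient(*counts), 3))
```

A positive value here means alternative A was chosen relatively more often by graduates; in a two-choice item the phi for alternative B is the same value with the sign reversed.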

In interpreting these phi coefficients, it can be said that for an N of
730, a phi of 0.07 is significant at approximately the 5 percent level,
and a phi of 0.09 is significant at the 1 percent level of confidence. In
the odds sample, 23 items exceed the 5 percent level of significance, of
which 16 exceed the 1 percent level. In the evens sample, the corresponding
figures are 21 and 6. There were 39 items keyed in the odds sample
and 27 items keyed in the evens sample.¹¹ Of these, only eight items
were keyed the same on both samples. Six items were keyed with opposite
signs, and 45 items were keyed zero on one sample and positive
or negative on the other. The discrepancy in the two keys is probably due
to a preponderance of bomber pilots in the evens sample. That there is
such a preponderance is indicated by the fact that the same items that
proved valid, in the study reported below on pilot specialization, for
discriminating between fighter and bomber pilots, have the most significant
positive phis in the evens sample only.

Table 26.34. — Distribution of phi coefficients for Aviation Preference Check List
based on pilots at all levels of training, using a modified graduation-elimination
criterion

                      f
Phi 1          Odds 2    Evens 3
0.15-0.19         ?         0
0.10-0.14        10         4
0.05-0.09        44        43
0.00-0.04        73        79
Total           121       12?

1 Since the test is a two-choice form, only positive phis are tallied.
2 N_t = 729, p_g = 0.7?. In classes 44F and 44G.
3 N_t = 730, p_g = 0.77. In classes 44F and 44G.

(2) Use of Aviation Preference Check List for pilot specialization. —
An item analysis was completed for this check list, using as the criterion
General Pilot Ability* ratings. Fifty-two bomber pilots, who were rated
"above average" in general pilot ability in transition schools, and 97
fighter pilots, who were rated "above average" in single-engine advanced
schools, were used. The distributions of the phi coefficients are shown
in table 26.35.

Table 26.35. — Distribution of phi coefficients for Aviation Preference Check List
based on a sample of 52 bomber pilots in transition training and 97 fighter pilots
in advanced training 1

Phi 2         Frequency      Phi 2         Frequency
0.29-0.33         3          0.09-0.13        40
0.24-0.28         7          0.04-0.08        31
0.19-0.23        14          0.00-0.03        41
0.14-0.18        17          Total           147

1 In class ...
2 Since the test is a two-choice form, only positive phis are tallied.

In  interpreting  these  phi  coefficients,  it  can  be  said  that  for  an  N 
of  150,  a  phi  of  0.15  is  significant  at  the  5  percent  level,  and  a  phi  of 
0.21  is  significant  at  the  1  percent  level  of  confidence.  In  this  sample, 
44  phis  exceed  the  5  percent  level  of  significance  and  27  exceed  the 
1  percent  level.  This  would  tend  to  show  that  if  items  that  differentiate 
between  bomber  and  fighter  pilots  were  keyed,  a  prediction  could  be 
made  that  would  help  to  place  pilots  properly  in  their  specialties. 


¹¹ Some items were keyed that did not quite reach the 5 percent level of significance, if other
data showed that the items could reasonably be expected to be valid.

Tables 26.36 and 26.37 present means, standard deviations, and critical
ratios of the differences in mean scores on the Aviation Preference
Check List for "high" and "low" bomber pilots and "high" and "low"
fighter pilots. For the purpose of obtaining the "high" and "low" groups,
the Pilot Proficiency Cards were consulted. The ratings bomber pilots
received in General Pilot Ability in advanced and transitional training
were summed, and two groups were formed: those at or above the mean
and those below the mean. The same was done for fighter-pilot ratings
in basic and advanced training. The four groups then were scored with
the empirical key, valid for the primary graduation-elimination criterion.


Table 26.36. — Means and standard deviations of high and low bomber and fighter
pilots on Aviation Preference Check List 1

Group            Score        N       M      SD
High bombers     Rights      213    18.9    4.?
                 Wrongs      213    21.2    4.9
                 R-W+40      213    38.0    9.?
Low bombers      Rights      180    22.6    6.9
                 Wrongs      180    17.9    6.1
                 R-W+40      180    45.0   11.9
High fighters    Rights      144    26.4    4.5
                 Wrongs      144    13.6    4.2
                 R-W+40      144    53.0    9.?
Low fighters     Rights      164    26.3    4.?
                 Wrongs      164    14.1    4.4
                 R-W+40      164    52.4    9.?

1 In class 44B. Tested in basic training by personnel of Psychological Research Unit No. ...


Table 26.37. — Critical ratios of differences of means of high and low bomber and
fighter pilots on Aviation Preference Check List

Group                               Score        CR
High bomber v. low bomber           Rights      *6.4
                                    Wrongs      *6.9
                                    R-W+40      *6.3
High fighter v. low fighter         Rights        .4
                                    Wrongs       1.0
                                    R-W+40        .9
High bomber v. high fighter         Rights     *15.0
                                    Wrongs     *15.7
                                    R-W+40     *15.4
Low bomber v. low fighter           Rights      *4.4
                                    Wrongs      *?
                                    R-W+40      *6.2

* Significant at or beyond the 1 percent level.


Evaluation. — The Aviation Preference Check List seems promising in
connection with pilot specialization. Statistical analysis of the items
suggests that there are personality traits which differentiate successful bomber
pilots from successful fighter pilots. A definite preference pattern appears,
indicating that the fighter pilot likes to fly for the sake of flying, for the
excitement, for the personal adventure, and that he has a devil-may-care
attitude. Bomber pilots appear to be more conscientious, methodical,
thorough, and ..., and must be willing to accept responsibility
for men and ...

Some  exampu's  ot  items  that  differentiate  enough  to  support  these 
descriptions  of  the  two  types  of  pilots  are  given  below.  Asterisks  indicate 
responses  positively  weighted  for  bomber  pilots: 


*A. Fly with others in the ship.

B. Fly solo.

A. Work involving few details.

*B. Work involving many details.

A. Aerobatics.

*B. Instrument flying.

*A. Read a book.

B. Read a magazine.

A. Repair a motor.

*B. Tell others how to.

A. Strafe hostile infantry.

*B. Bomb hostile fort

*A. Fight from formation.

B. Fight individual dogfights.

*A. Ground School Instructor.

B. Physical Training Instructor.

Inventory of Experiences, Interests, and Attitudes, CE612AX2 ¹²

This  is  another  inventory  designed  to  assess  personal  background  and 
preferences. 

Description. — The inventory consists of a number of questions
concerning the examinee's past experiences and activities, his interests and
preferences, his feelings and attitudes. The majority of the items are of
the multiple-choice type. Some of the activities, however, are paired,
and the examinee is required to choose one of the two activities of each
pair. These paired items are quite similar to the items in the Aviation
Preference Check List. In the third section of the inventory different jobs
are described, and the examinee is asked to rate the job on a scale
ranging from "very much more attractive" to "very much less attractive." The
last ten questions consist of a list of 10 airplanes. The examinee is asked
to express on a five-point scale how he would feel about being assigned
to training on each of the planes. Some of these last questions were used
as the criterion for the rest of the test.

(1) Internal characteristics. — There are 150 items, divided into 4
parts: part I contains 82 items; part II, 40 items; part III, 20 items;
and part IV, 10 items. Typical items of part I are given below. The plus
and minus signs indicate weights of +1 and −1 derived from
item-validation procedures to be described below.

How often do you write your parent or parents?

0 A. Parents not living.

− B. Almost every day.

− C. About twice a week.

+ D. About once a week.

+ E. Less frequently than once a week.

Do you spend a good deal of time planning what you wish to do after the war?

+ A. Yes.

− B. No.


*DctiM  *(  hUtnmn 
a  Cktti  cMViWun:  M4  k 


AAF  TnMmg  O 
U  TWbAJm  m4 


Mi*.  S. 


Oak  X*.

Two typical items of part II follow. The examinee is required to
choose one of the alternatives.

− A. Do stunt flying.

+ B. Do straight flying.

− A. Aerobatics.

+ B. Instrument flight.

In part III, the examinee rates jobs on a five-point scale: (A) very
much more attractive, (B) more attractive, (C) neither more nor less
attractive, (D) less attractive, and (E) very much less attractive.

Two typical job descriptions in this part are:

In this job the pay consists mostly of commissions on sales. (B is keyed minus
and D plus.)

This job calls for a high degree of skill in athletics. (A is keyed minus and B plus.)

In part IV, the examinee indicates his job satisfaction, using a
five-point scale: (A) would be extremely well satisfied with this assignment;
(B) would be well satisfied with this assignment; (C) would be
moderately well satisfied with this assignment; (D) would prefer a different
assignment; and (E) would very much prefer a different assignment.

Using this scale, he expresses his satisfaction with an assignment to
training in the following planes: B-17, B-24, B-29, B-25, B-26, A-20,
P-38, P-40, P-47, and P-51.

(2)  Administration. — While  there  are  no  time  limits  for  this  test, 
experience  has  shown  that  all  should  be  finished  at  the  end  of  35  minutes. 
It  has  been  found  advisable  to  pace  individuals  at  the  end  of  15  minutes 
by  saying,  "You  should  be  on  item  No.  70." 

(3)  Scoring. — To  secure  a  criterion  for  the  development  of  an  empiri¬ 
cal  key,  answer  sheets  of  bomber-pilots  in  transitional  training  were 
scored  on  the  questions  concerning  the  B-17,  the  B-24,  and  the  B-29. 
This  provided  a  possible  range  of  scores  from  3  through  15,  since  an 
A  response  was  scored  as  five  points,  B  as  four,  C  as  three,  D  as  two, 
and  E  as  one.  Answer  sheets  of  fighter-pilots  in  advanced  training  were 
scored  on  the  questions  concerning  the  P-38,  the  P-40,  the  P-47,  and 
the  P-51,  thus  providing  a  possible  range  of  scores  from  4  to  20.  These 
were  the  criterion  scores  which  were  taken  to  represent  degrees  of 
satisfaction. 
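
The criterion scoring just described can be sketched as follows. The point values and the plane lists come from the text; the function name, the data layout, and the sample ratings are hypothetical.

```python
# Points per response, as stated in the text: A = 5 down to E = 1.
POINTS = {"A": 5, "B": 4, "C": 3, "D": 2, "E": 1}

BOMBER_PLANES = ("B-17", "B-24", "B-29")            # possible range 3 to 15
FIGHTER_PLANES = ("P-38", "P-40", "P-47", "P-51")   # possible range 4 to 20


def satisfaction_criterion(ratings, planes):
    """Sum the part IV ratings over the planes relevant to the pilot's specialty."""
    return sum(POINTS[ratings[plane]] for plane in planes)


# Hypothetical part IV responses of one bomber pilot.
bomber_ratings = {"B-17": "A", "B-24": "B", "B-29": "C"}
print(satisfaction_criterion(bomber_ratings, BOMBER_PLANES))  # 12
```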

Approximately 1,800 bomber-pilots and 1,600 fighter-pilots had taken
the inventory. They were tested in transitional and advanced training,
respectively, in both Western and Central Flying Training Commands,
by personnel of Psychological Research Unit No. 2. Each set of papers
was divided into two random halves, and item analyses were
accomplished. On the basis of these item analyses, two keys were constructed in
a manner described below.

The scoring formula is R − W + 20, in which R represents responses
with positive weights and W responses with negative weights.

Statistical results. — As described above, this test was administered
to pilots in advanced and transitional training, after they had had
experience in flying specialized planes. The test scores (criterion scores
described above) for the inventory were reduced to a 9-point scale, with
an attempt to obtain a mean of 5 and a standard deviation of 2.

Item analysis. — The papers were split at random into four groups:
bomber-pilot odds and evens samples, and fighter-pilot odds and evens
samples. Using the highest and lowest 25 percent of the subsamples, on
the basis of the criterion scores, 6 item analyses were accomplished.
The six item analyses contrasted high v. low fighter-pilot groups, high v.
low bomber-pilot groups, and high bomber v. high fighter-pilot groups,
for the odds and evens samples separately.

Two criteria were used for the inclusion of a response in a key: the
response had to differentiate the high-bomber-pilot group from the
high-fighter-pilot group with a phi of 0.15 or better (beyond the 1 percent
level of confidence), and it had to differentiate either the high-fighter-pilot
group from the low-fighter-pilot group or the high-bomber-pilot
group from the low-bomber-pilot group with a phi of 0.10 or better (beyond
the 5 percent level of confidence). All phis were computed from percentages
based on the total number answering a question, rather than
the total number taking the test.
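
The two inclusion criteria can be sketched as follows. The phi formula is the standard one for a fourfold table; the function names, the use of absolute values, and the example counts are assumptions made for illustration. In keeping with the text, each table counts only examinees who answered the question.

```python
from math import sqrt


def phi_from_counts(a, b, c, d):
    """Phi coefficient for a fourfold table.

                 chose response   answered otherwise
    group 1            a                  b
    group 2            c                  d
    """
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else 0.0


def include_in_key(phi_bomber_vs_fighter, phi_fighter_high_low, phi_bomber_high_low):
    """Apply both criteria described in the text.

    Assumption: the thresholds refer to the size of phi regardless of sign.
    """
    between = abs(phi_bomber_vs_fighter) >= 0.15
    within = max(abs(phi_fighter_high_low), abs(phi_bomber_high_low)) >= 0.10
    return between and within


# Hypothetical counts: 60 of 100 high bomber pilots and 40 of 100 high
# fighter pilots chose a given response.
phi = phi_from_counts(60, 40, 40, 60)
print(round(phi, 2))                      # 0.2
print(include_in_key(phi, 0.05, 0.12))    # True
```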

The test was keyed for bomber-pilot satisfaction. In the final odds
key, 31 responses were keyed positively, and 32 responses were keyed
negatively. In the final evens key, 41 responses were keyed positively,
and 48 were keyed negatively. No responses were keyed negatively in
one sample that were also keyed positively in the other, or vice versa.
The numbers of significant responses at the 1 and 5 percent levels are
presented in table 26.38.


Table 26.38. — Number of significant responses for Inventory of Experiences,
Interests, and Attitudes, CE612AX2, based on pilots tested in advanced and
transition training

                                                              Number of
                                                         significant responses
Analysis                                      Sample   N   5 percent   1 percent

High bomber pilots v. low bomber pilots       Odds
                                              Evens
High fighter pilots v. low fighter pilots     Odds
                                              Evens
High bomber pilots v. high fighter pilots     Odds
                                              Evens

[Numerical entries illegible.]

(3) Test validity. — Cross-validation results, based on pilots tested
in advanced and transition training with the satisfaction score as the
criterion, are given in table 26.39.

Table 26.39. — Product-moment correlations of Inventory of Experiences, Attitudes,
and Interests, CE612AX2, with satisfaction scores as the criteria, based on pilots
tested in advanced and transition training

Group             Sample         Key            N        r

Bomber pilots     [illegible]    [illegible]    901     0.38
   Do             [illegible]    [illegible]    916      .4[?]
Fighter pilots    [illegible]    [illegible]    797    ¹−.34
   Do             [illegible]    [illegible]    843    ¹−.30

¹ [Since the test was keyed for bomber-pilot satis]faction, these negative
correlations are in the expected direction.


Evaluation. — This  test  appears  to  have  a  quite  satisfactory  validity. 
The  cross-validation  is  more  a  test  of  internal  consistency,  however, 
than  it  is  a  genuine  test  of  validity.  Since  the  requirements  used  in  mak¬ 
ing  up  the  scoring  keys  were  stringent,  and  since  both  keys  were  valid, 
the  indications  are  that  the  instrument  might  be  of  distinct  value  in 
assigning  pilots  to  specialized  training.  Further  evidence  supporting  this 
conclusion  is  the  fact  that  some  of  the  items  in  part  II  that  have  high 
positive  phis  are  items  appearing  in  the  Aviation  Preference  Check  List 
which  significantly  discriminate  between  above-average  bomber  pilots  in 
transition and above-average fighter pilots in single-engine advanced
training. 

It  must  be  realized  that  the  data  obtained  for  this  inventory  were  based 
upon pilots who were in transition and advanced schools. This is a
highly  selected  group,  in  that  it  does  not  include  those  pilots  who  were 
eliminated  at  the  primary  or  basic  level.  In  addition,  pilots  in  transition 
and  advanced  schools,  for  the  most  part,  have  already  decided  on  the 
basis  of  actual  experience  with  one  or  the  other  type  of  training  which 
type  of  training  they  prefer,  further  biasing  the  group.  It  is  possible 
that  similarly  good  results  might  not  be  obtained  if  the  inventory  were 
administered at a much earlier stage, i.e., during the classification period,
or  at  primary  school  and  then  validated  on  a  later-obtained  criterion. 


Specialization  Preference  Inventory,  CE610A  11 

This inventory was constructed on the assumption that preferences for
certain  activities  and  likes  and  dislikes  would  differentiate  potentially 
successful  fighter  and  bomber  pilots. 

Description. — An  attempt  was  made  in  this  inventory  to  pair  system¬ 
atically  some  of  the  alternatives  of  every  category  with  an  equal  number 
of alternatives from other categories. These categories are: intellectual v.
nonintellectual, mechanical v. nonmechanical, routine v. nonroutine, social
v.  nonsocial,  responsibility  v.  nonresponsibility,  social  values  v.  economic 
values,  etc.  This  inventory  differs  from  previous  ones  in  that  military 
activities  were  matched  extensively  with  civilian  activities. 

(1) Internal characteristics. — The inventory contains 122 items. Two
typical  items  are: 

11 Developed at Psychological Research Unit No. 3. Chief contributors: Lt. John I. Lacey
and Lt. Eli A. Lipman.

A. Read a book on gunnery.

B.  Read  a  book  on  propaganda  methods. 

A.  Building  wooden  cabinets. 

B.  Interview  job  applicants. 

(2)  Administration. — The  examinee  is  instructed  to  answer  each  item 
by  indicating  the  one  activity  of  two  presented  that  he  prefers.  He  is 
further  instructed  to  answer  quickly,  according  to  his  first  impression, 
avoiding  deliberation. 

The time limits are as follows: administration time, 2 minutes; testing
time, 25 minutes. The test is paced by announcing the number of the
question the examinee should have finished at certain times; namely,
item 30 when a quarter of the time has elapsed, item 60 when one-half
of the time has elapsed, and item 90 when three-quarters of the time
has elapsed.

(3)  Scoring. — An  empirical  scoring  key  based  on  the  results  of  an 
internal-consistency  item  analysis  was  constructed.  This  key  has  71 
positively (R) and negatively weighted (W) items. The scoring formula
is R − W + 40.

Statistical  results. — The  data  on  this  test  are  restricted  to  a  sample 
of pilots tested in basic flying training, in classes 44H and 44I, by personnel
of Psychological Research Unit No. 3.

(1) Distribution statistics. — Based on the final empirical key, a sample
of  724  classified  pilots  yielded  a  mean  score  of  32.7  and  a  standard 
deviation  of  16.9. 

(2)  Internal  consistency. — An  a  priori  bomber-pilot  key  was  used  to 
score  the  answer  sheets  of  750  classified  pilots  (sample  I).  This  key 
represented  the  consensus  of  12  aviation  psychologists.  No  item  was 
accepted  unless  at  least  10  of  the  12  judges  agreed.  With  the  total  score 
as  the  criterion,  an  internal-consistency  item  analysis  was  completed. 
The homogeneity of the items is indicated by a mean phi of 0.27, a
standard deviation of 0.13, and a range of values from 0.00 to 0.53.
These statistics are based upon analysis of the highest 27 percent and
the  lowest  27  percent  in  total  score. 
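
The selection of the extreme groups for such an item analysis can be sketched as below. The function name, the rounding of the group size, and the sample data are assumptions made for illustration; the text specifies only that the highest and lowest 27 percent in total score were contrasted.

```python
def extreme_groups(total_scores, fraction=0.27):
    """Return (high_ids, low_ids): the top and bottom `fraction` of examinees.

    total_scores: dict mapping an examinee identifier to a total score.
    Ties at the cutting point are broken by the original ordering, which
    is an arbitrary choice made for this sketch.
    """
    ranked = sorted(total_scores, key=total_scores.get, reverse=True)
    k = round(len(ranked) * fraction)
    if k == 0:
        return [], []
    return ranked[:k], ranked[-k:]


# Hypothetical total scores for 100 examinees.
scores = {f"pilot_{i}": i for i in range(100)}
high, low = extreme_groups(scores)
print(len(high), len(low))   # 27 27
```

Each item response is then tabulated in the high and low groups, and a phi coefficient is computed from the resulting fourfold table.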

A new key was derived from this analysis. The answer sheets of a
new sample (II) of 750 cases and of the previous sample (I) were then
scored, using the new key. The data in table 26.40 are based on analysis
of  the  responses  of  the  highest  27  percent  and  the  lowest  27  percent 
in  the  new  total  score. 


Table 26.40. — Internal-consistency data for 122 items of Specialization Preference
Inventory

                                   Range of phi
Sample    M phi    SD phi        Low        High

I          0.31     0.11         0.07       0.58
II          .30      .11          .06        .51

(3) Reliability coefficient. — By the odd-even method, an estimated
reliability coefficient of 0.78, corrected for length, was obtained. This
figure is based on a sample of 724 classified pilots.
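
The text does not name the correction for length; the Spearman-Brown formula is the usual one for the odd-even method and is assumed in the sketch below. The half-test correlation shown is the value implied by that formula for a corrected coefficient of 0.78, not a figure reported in the source.

```python
def spearman_brown(r_half, factor=2.0):
    """Estimate the reliability of a test lengthened `factor` times.

    r_half: correlation between the two half-tests (odd v. even items).
    """
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)


def implied_half_correlation(r_full):
    """Invert the formula for the doubled-length case (factor of 2)."""
    return r_full / (2.0 - r_full)


r_half = implied_half_correlation(0.78)
print(round(r_half, 2))                  # 0.64
print(round(spearman_brown(r_half), 2))  # 0.78
```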

(4) Intercorrelations. — Since a test of this sort is very likely to
measure verbal and other intellectual abilities, a correlation between this
test and Reading Comprehension, CI614H, was secured. Based on 507
classified pilots, the correlation is only 0.01, corrected for attenuation
in both variables. The test, therefore, obviously has no verbal variance.
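
A correction for attenuation in both variables divides the obtained correlation by the geometric mean of the two reliabilities. The sketch below uses that standard formula; the function name and the numerical values are hypothetical, since the text gives neither the uncorrected correlation nor the reliabilities used.

```python
from math import sqrt


def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Estimate the correlation between true scores on two tests.

    r_xy: obtained correlation between the tests.
    r_xx, r_yy: reliability coefficients of the two tests.
    """
    return r_xy / sqrt(r_xx * r_yy)


# Hypothetical values, for illustration only.
print(round(correct_for_attenuation(0.30, 0.81, 0.64), 3))  # 0.417
```

Since the correction can only enlarge a correlation in absolute value, a corrected figure as small as 0.01 implies an obtained correlation at least as close to zero.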

Correlations with navigator and pilot stanines are only 0.05 and
−0.13, based on classified pilots numbering 505 and 634 respectively.

Evaluation. — This inventory has three advantages over previous inventories.
It contains a more systematic pairing and a wider coverage of
activities.  The  items  are  so  presented  that  the  purpose  of  the  inventory 
is  less  obvious  to  the  examinee. 

A  review  of  the  items  that  are  highly  consistent  with  the  total  score 
in  both  analyses  reveals  patterns  of  interest  similar  to  those  empirically 
evolved  from  the  Aviation  Preference  Check  List  and  the  Satisfaction 
Test.  In  table  26.41,  a  sample  is  given  of  these  items,  together  with  anal¬ 
ogous  items  that  were  valid  in  the  Aviation  Preference  Check  List  and 
Satisfaction  Test,  CE409B.  This  points  strongly  to  the  conclusion  that 
Specialization  Preference  Inventory  could  serve  as  a  useful  instrument  in 
assorting  pilots  for  specialized  training. 

Its  relatively  high  internal  consistency,  unusual  for  a  test  of  this 
nature, its low correlations with pilot and navigator stanines, and its lack
of  correlation  with  a  reading-comprehension  test,  all  indicate  the  potential 
usefulness  of  the  inventory. 


Table 26.41. — Comparison of some items in Specialization Preference Inventory,
CE610A, Aviation Preference Check List, and Satisfaction Test, CE409B ¹

Specialization Preference                      Aviation Preference Check List
Inventory items                                or Satisfaction Test items

  7.  A. Perform some types of technical        53.  A. Repair a motor.
         work on a construction project.            *B. Tell others how to.
     *B. Have complete administrative
         responsibility for a construction
         job.

 19.  A. Perform acrobatics in a basic          10.  A. Acrobatics.
         training plane.                            *B. Instrument flight.
     *B. Complete an instrument flight in
         an advanced training bomber.

 54. *A. Direct and supervise ground-school    101. *A. Ground school instructor.
         curriculum.                                 B. Physical training instructor.
      B. Be in charge of cadet athletic
         activities.

 70. *A. Use maps and landmarks to plot        As an enlisted member of a combat crew, I
         position.                             would get more satisfaction from being a
      B. Be the tail gunner on a B-17.          18. *A. Radio operator.
                                                     B. Tail gunner.
                                                25. *A. Flight engineer.
                                                     B. Tail gunner.

 76. *A. Be an honor student in a college       30. *A. Be a famous artist!
         class.                                      B. Be a famous football star!
      B. Win a varsity letter.

114. *A. Adjust mental difficulties of          81. *A. Have control of other people.
         people.                                     B. Have control of a machine (like
      B. Operate a steam locomotive.                    a plane).

¹ Asterisks indicate items positively weighted for bomber pilots.


Specialization Interest Inventory, CE609A 12

The  rationale  for  this  test  is  the  same  as  for  Specialization  Preference 
Inventory,  CE610A. 

Description. — This  inventory  is  patterned  after  the  Strong  Vocational 
Interest  Blank.  As  such,  it  covers  an  even  wider  range  of  interests,  both 
occupational  and  scholastic,  than  the  Specialization  Preference  Inventory, 
CE610A. 

(1) Internal characteristics. — There are 250 items divided into three
parts. Part I contains 185 occupations listed alphabetically, such as auditor,
advertiser, architect, astronomer, etc. The examinee must respond
to  each  item  in  one  of  three  ways — like,  indifferent,  or  dislike.  Part  II 
contains  35  items  listing  school  subjects,  such  as,  foreign  language,  social 
science,  philosophy,  etc.  Again  the  examinee  must  respond  to  each  with 
like,  indifferent,  or  dislike.  Part  III  has  30  items  listing  sports  and  posi¬ 
tions  on  teams  (catcher,  pitcher,  etc.).  In  each  of  these  items  the  ex¬ 
aminee  selects  the  one  he  enjoys,  or  thinks  he  would  enjoy,  most  and 
the  one  he  enjoys,  or  thinks  he  would  enjoy,  least.  An  example  of  this 
type  of  item  follows : 

Of the following positions in football, which one would you enjoy most?

Of the following positions in football, which one would you enjoy least?

A. End.

B. Guard.

C. Center.

D. Quarter or half-back.

E. Fullback.

(2) Administration. — The following sentences are excerpts from the
directions : 

Indicate  after  each  occupation  listed  below  whether  you  would  like  that  kind  of 
work.  You  are  not  asked  if  you  would  take  up  the  occupation  permanently,  but 
whether you would enjoy the kind of work, regardless of any necessary skills, abilities,
or training which you may or may not possess * * *

Work rapidly. Use only your first impressions in answering. Answer all the items.

The time limits are: testing time, 28 minutes; administration time,
2 minutes. The examinees are paced by announcing the number of the
question the examinee should have finished, i.e., items 63, 126, and 188
at the end of 7, 14, and 21 minutes, respectively.

(3)  Scoring. — No  key  is  available. 

Statistical results. — No statistical data are available.

Social Concepts, CE512A 13

This  test  was  constructed  for  the  purpose  of  determining  whether  or 
not successful and unsuccessful air-crew trainees differ in their social
beliefs  and  attitudes. 

12 Developed at Psychological Research Unit No. 1. Chief contributors: Capt. John I. Lacey
and Lt. Eli A. Lipman.

13 Developed at Psychological Research Unit No. 1.

A number of studies of fear in combat, e.g., that of Dollard, indicate
that  proper  motivation  of  the  soldier  is  an  important  factor  in  overcom¬ 
ing  fear.  Proper  motivation  also  enables  the  soldier  to  adjust  to  new 
situations  and  to  cope  with  the  deprivations  involved  in  his  experience. 
As  belief  in  the  cause  for  which  one  is  fighting  was  known  to  be  an 
important  element  in  military  motivation,  a  test  of  social  concepts  and 
of  social  morale  was  constructed. 

Description. — This  test  may  be  viewed  either  as  a  measure  of  the 
individual's  understanding  of  or  his  attitudes  towards  social  problems. 
Since  both  views  can  be  assumed  to  have  some  bearing  on  motivation, 
it  was  not  considered  important  to  distinguish  between  measures  of 
understanding  and  measures  of  attitudes.  The  method  of  responding  to 
the statements (true or false, acceptable or unacceptable) does make it
possible,  however,  for  the  examinee  to  indicate  whether  he  believes  he  is 
dealing  with  facts  or  with  opinions. 

(1) Internal characteristics. — The "Test of Social Understanding"
developed  by  the  Cooperative  Study  in  General  Education  (2)  was  ex¬ 
amined  and  its  contents  modified  in  order  to  make  the  items  suitable  for 
aviation  students.  One-hundred  forty-six  items  were  constructed.  A  few 
typical  items  are: 

A. Everybody has an equal chance in America.

B. If we could pass the right laws, we could solve our social problems once
and for all.

C. The United States has nothing like social classes.

D. The people who complain about an unfair press are free to start a paper
of their own.

E. There never was a modern war that wasn't started by the bankers and
munitions-makers.

F. Discussing social issues does not help to solve them.

(2) Administration. — The following are excerpts from the directions
to  the  Social  Concepts  Test : 

The statements in this test are those which are frequently heard in the everyday
remarks people make * * *

If the statement is true, blacken space A * * *

If the statement is false, blacken the space under the letter B, etc.

If the statement is neither true nor false, but you agree with its point of view,
i.e., have a preference for it, blacken the space under C.

If the statement is neither true nor false, but you disagree with its point of view,
i.e., do not have a preference for it, blacken the space under the letter D.

If none of these ways describes your reaction to the statement, you may blacken
the space under the letter E.

No  time  limits  are  set. 

(3) Scoring. — The scoring formula is R − W + 20, in which R refers
to the positively weighted responses and W refers to the negatively
weighted responses.

Statistical results. — (1) Item validity. After dividing a sample of
1,014 answer sheets into two random halves, the responses to the items
were correlated with the primary pilot graduation-elimination criterion.
The distributions of the phi coefficients are shown in table 26.42. Items
are not included if one response is chosen by more than 90 percent or
less than 10 percent of the examinees.


Table 26.42. — Distribution of validity phi coefficients for Social Concepts Test,
CE512A, based on pilots in primary training, using graduation-elimination as the
criterion

Phi                      Odds ¹    Evens ¹
                           f          f

0.15-0.17
0.12-0.14
0.09-0.11
0.06-0.08
0.03-0.05
0.00-0.02
(negative intervals)
Total

[Frequencies illegible.]

¹ [Footnote illegible apart from the year 1943.]


In  interpreting  these  phi  coefficients,  it  can  be  said  that  for  an  N 
of  507,  a  phi  of  0.09  is  significant  at  approximately  the  5  percent  level, 
and  a  phi  of  0.11  is  significant  at  the  1  percent  level  of  confidence.  In 
the  odds  sample,  32  phis  exceed  the  5  percent  level  of  confidence  and 
20  exceed  the  1  percent  level.  In  the  evens  sample,  the  corresponding 
figures  are  39  and  11.  There  was  almost  no  correlation  between  the  two 
sets  of  significant  items,  with  but  three  keyed  responses  in  common. 
For  one  item  a  response  was  keyed  positive  in  one  key  and  negative 
in  the  other. 
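
The significance levels quoted for phi are consistent with the relation chi-square = N × phi², with one degree of freedom. The text does not state that this relation was used, so the sketch below is an assumption; the function name is hypothetical, and the critical values 3.841 and 6.635 are the standard tabled chi-square values for one degree of freedom.

```python
from math import sqrt

# Tabled chi-square critical values for 1 degree of freedom.
CHI_SQUARE_1DF = {0.05: 3.841, 0.01: 6.635}


def critical_phi(n, alpha):
    """Smallest phi reaching significance at level `alpha` for sample size n.

    Assumes chi-square = n * phi**2 with 1 degree of freedom.
    """
    return sqrt(CHI_SQUARE_1DF[alpha] / n)


print(round(critical_phi(507, 0.05), 2))  # 0.09
print(round(critical_phi(507, 0.01), 2))  # 0.11
```

With several hundred responses tested in each half-sample, a number of coefficients will pass these limits by chance alone, which is why agreement between the odds and evens samples is the more telling check.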

Evaluation. — Since  there  are  only  two  keyed  responses  in  common  in 
the  odds  and  evens  samples,  a  cross-validation  of  scores  using  the  two 
samples  probably  would  show  a  very  low  validity  for  the  primary  pilot 
criterion. 

Traits measured by this test are believed to be more important in combat
than in training. Since combat criteria are difficult to obtain, proper
validation  of  these  tests  has  not  been  effected. 

Survey of Personal Attitudes, CE508B 14

The  purpose  of  this  test  is  to  measure  readiness  for  combat  duty  in 
terms  of  the  soldier’s  affective  organization  and  orientations. 

It  is  assumed  in  this  test  that  certain  established  attitudes  or  affective 
habits are conducive to efficiency, stability, and endurance in combat
situations, while others are not. It is further assumed that by examining
the student's likes and dislikes for carefully selected words with appropriate
affective connotations, one may obtain a partial measure of his
emotional  preparedness  for  combat  duty. 

Description. — In this test the examinee is presented with series of
four  words  each,  designed  to  have  affective  connotations.  The  examinee 

14 Developed at Psychological Research Unit No. 1. Chief contributors: Lt. Vivian [surname illegible],
Capt. Donald E. Super.
indicates which word in each series is most pleasant or most unpleasant.
An  attempt  was  made  to  collect  an  adequate  number  of  the  following 
classes  of  words : 

(1) Those which have had their unpleasant implications enhanced or
colored by the war situation.

(2) Those which have had their pleasant implications enhanced or
colored by the war situation.

(3) Those which have not had their implications affected by the war
situation.

The words are grouped in units of four in such a way that: (1) each
unit contains only words of apparent equal pleasantness or unpleasantness,
and (2) each unit contains both words which have had their affective
coloring changed or enhanced by the war situation and words which
have not been affected.

(1) Internal characteristics. — The test has two parts. Part I consists
of 60 groups of words which have unpleasant associations or meanings
for most persons. One such group of words, for example, is: "disease,
slink, selfishness, ugly." Another group: "Jerry, German, Hun, Nazi."
Part II of the test consists of 60 groups of words which have pleasant
associations or meanings. Examples of these items are: "freedom, power,
peace, ability;" and "airspeed, velocity, speed, rate."

(2)  Administration. — The  following  are  excerpts  from  the  directions 
to  part  I : 

The purpose of this test is to learn what words are most unpleasant to you
and what words are most pleasant. Experience has shown that one's feelings
toward various aspects of life have a bearing on success in cadet training.
There are no right or wrong answers * * * You are to read the words
carefully and indicate under the appropriate letter on the answer sheet the one
word in each group which you find is most unpleasant. If no word in a group
is unpleasant, then indicate the word which comes nearest to being unpleasant.
Be sure to indicate one word and only one in each group of four words.

In part II, the examinee indicates the one word which he finds to be
most pleasant.

(3) Scoring. — An a priori key was constructed in the following manner.
Positive weights were given to those responses which were thought to
evidence readiness for combat. For example, in part I in the series "to
injure, to hurt, to wound, and to harm," to wound was weighted positively.
In part II, in the series "letters, postcards, correspondence, and
communications," letters was weighted positively. The total score is the
sum of the positively weighted responses.

Statistical results. — (1) Distribution statistics. — Distribution statistics
are given in table 26.43.

(2) Reliability coefficient. — An estimate of reliability obtained by
correlating part I with part II is 0.14. This low correlation indicates either
that  the  two  parts  do  not  measure  the  same  thing  or  that  the  part  scores 
themselves  are  unreliable. 


Table 26.43. — Distribution constants for Survey of Personal Attitudes, CE508B,
based upon 870 classified pilots ¹

¹ Tested in June 1944 at Psychological Research Unit No. 1.

(3) Test validity. — Validation results for both parts of the test are
given in table 26.44.

Table 26.44. — Validity data for Survey of Personal Attitudes, CE508B, based upon
pilots in primary training (N = 870; p = 0.[?]9) ¹


Evaluation. — This  technique  designed  for  measuring  "readiness  for 
combat" did not discriminate between pilot graduates and eliminees, at
least  in  primary  training.  This  may  be  due  to  a  number  of  reasons.  While 
good  combat  pilots  are  aggressive,  self-assertive,  and  happy-go-lucky, 
these  traits  may  not  be  susceptible  to  measurement  by  a  comparatively 
subtle  testing  instrument.  A  second  defect  might  be  in  using  the  a  priori 
method  of  keying  words  that  have  varying  connotations.  If  an  empirical 
key  were  established,  this  test  might  prove  useful  in  testing  the  original 
hypothesis.  Combat  criteria,  however,  are  needed  for  item  and  score 
validation  of  a  test  of  this  type. 

Inventory of Attitudes, CE518A 15

This  test  is  designed  to  measure  certain  personality  traits  that  are 
believed to be conducive to the development of psychoneuroses, particularly
under near-combat and combat conditions.

Description. — A  general  orientation  to  the  selection  of  items  was 
effected  by  using  a  digest  of  personality  traits  gleaned  from  "War 
Neuroses  in  North  Africa"  (3)  by  Grinker  and  Spiegel  and  from  other 
field  studies. 

(1) Internal characteristics. — The test is divided into three parts. Part
I  contains  91  items  requesting  the  examinee  to  express  his  opinions  of 
the  importance  or  significance  of  different  combat  or  near-combat  duties 
and  responsibilities.  Two  examples  from  part  I  are: 

With  regard  to  combat  assignments,  my  concern  as  to  whether  I  will  have  com¬ 
petent  leadership  from  superior  officers: 

A. Is very great.

B. Is considerable.

C. Is slight.

D. Is absent.

15 Developed at Psychological Research Unit No. 1. Chief contributors: Lt. Vivian [surname illegible],
Capt. Donald E. Super.

I  believe  the  war  is  hardest  on: 

A.  The  fighting  men. 

B.  Parents  of  fighting  men. 

C. Children of fighting men.

D.  Wives  and  sweethearts  of  fighting  men. 

Part  II  includes  85  items  requesting  the  examinee  to  give  a  direct 
evaluation of his personality. Typical items from part II are:

If  one  of  my  superiors  keeps  picking  on  me,  I  shall: 

A.  Tell  him  exactly  what  I  think. 

B.  Tell  my  squadron  mates  what  I  think  of  him. 

C. Say nothing, but blow off steam some other way.

D.  Try  not  to  be  disturbed  by  the  matter. 

When I fall in love or make friends, the relationships can best be described as:

A. Very intense and short lived.

B. Very intense and long lived.

C. Casual and short lived.

D. Casual and long lived.

The 48 items in part III require the examinee to give his judgment of
his friends' or family's evaluation of his personality. Typical items of
part III are:

My  friends  and/or  family  seem  to  think  I  am  inclined: 

A. Never to give up.

B. To give up only after great effort.

C. To give up easily.

D. To give up very easily.

My instructors usually seem (seemed) to like me:

A. Very well.

B. Fairly well.

C. Very little.

D. Not at all.

(2)  Administration. — The  following  are  excerpts  from  the  directions: 

In  this  booklet  you  are  asked  for  certain  information  about  some  of  your  attitudes 
and  opinions.  This  is  not  a  test  in  the  ordinary  sense;  there  are  no  right  answers 
except  the  ones  which  most  truly  reflect  your  own  particular  attitudes  and  opinions. 

To a large extent your success in flying depends on how well you are understood
by those in charge. All of the information asked for in this booklet is for the purpose
of aiding your superior officers in understanding you. It is to your own advantage,
therefore, to indicate your answers to the items in this booklet as carefully,
completely, and honestly as you can.

(3) Scoring. — There is no a priori scoring key. An attempt, described
below, was made to develop an empirical key.

Statistical  results. — Only  item  validity  data  are  available  for  this  test, 
based  upon  pilots  originally  tested  in  May  1944  at  Psychological  Research 
Unit No. 1.

(1) Item validity. — After dividing a sample of 752 answer sheets into
two random halves, the responses were correlated with the primary pilot
graduation-elimination criterion. Results are shown in table 26.45. Items
are not included if one response is chosen by more than 90 percent or
less than 10 percent of the examinees.


Table 26.45. — Distribution of validity phi coefficients for Inventory of Attitudes,
CE518A, based on pilots in primary training, using graduation-elimination as the
criterion


In  interpreting  these  phi  coefficients,  it  can  be  said  that  for  an  N 
of  376,  a  phi  of  0.10  is  significant  at  approximately  the  5  percent  level, 
and a phi of 0.13 is significant at the 1 percent level of confidence. In
the  odds  sample,  77  phis  exceed  the  5  percent  level  of  significance  and 
23  exceed  the  1  percent  level.  In  the  evens  sample,  the  corresponding 
figures  are  83  and  20. 

Evaluation. — An  examination  of  the  items  that  were  keyed  in  the  same 
direction  on  both  samples  does  not  reveal  any  pattern  of  behavior  that 
would  differentiate  successful  from  unsuccessful  pilots  at  the  primary 
training  level.  Further  validation  is  necessary  using  combat  success  or 
failure  as  the  criterion,  since  this  test  was  constructed  on  that  premise. 


Conduct of the War Test, CE520A 16

This  test  is  designed  to  measure  the  extent  of  the  examinee’s  informa¬ 
tion  concerning  the  conduct  of  the  war ;  that  is,  military  events,  methods, 
objectives,  and  principles.  This  procedure  is  based  on  the  hypothesis 
that  men  who  are  motivated  by  a  patriotic  interest  in  the  war  and  by  pride 
in  military  achievement  will  acquire  more  information  concerning  the 
conduct  of  the  war.  It  is  believed  important  to  measure  military  motiva¬ 
tion,  since  aviation  students  go  through  a  long  and  strenuous  training 
period, followed by the dangers of combat.

Description. — One hundred twenty-five items were constructed for
this test based on political and military events that took place during
the war period, 1939 through 1944. Some aviation-information items are
included also.

(1) Internal characteristics. — The test consists of information items
such as the following:


Germany  invaded  Poland  in: 

A.  October,  1938. 

B.  January,  1938. 


-  Df-rrloped  at  V1*  *■  CW<  ca»t  riWac. :  *€*•  OrrO* 


761 


G  September,  1939. 

D.  December,  1939. 

E.  Don't  know. 

Secretary  of  Navy  Knox  was  succeeded  by:

A.  Patterson.

B.  Welles.

C.  Forrestal.

D.  Wickard.

E.  Don't  know.

(2)  Scoring. — The  scoring  formula  for  this  test  is  R − W/3.  For
validation  purposes,  rights  and  wrongs  were  scored  separately. 
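
The formula score can be illustrated with a short Python sketch. The function names are hypothetical, and the treatment of "Don't know" and omitted answers (counted as neither right nor wrong) is an assumption that the text does not state.

```python
from fractions import Fraction


def count_rights_and_wrongs(responses, key, dont_know="E"):
    """Count rights and wrongs for one answer sheet.

    Assumption (not stated in the source): a "Don't know" choice or an
    omitted answer (None) is counted as neither right nor wrong.
    """
    rights = wrongs = 0
    for chosen, correct in zip(responses, key):
        if chosen is None or chosen == dont_know:
            continue
        if chosen == correct:
            rights += 1
        else:
            wrongs += 1
    return rights, wrongs


def formula_score(rights, wrongs, wrong_divisor=3):
    """Return R - W/3 as an exact fraction."""
    if rights < 0 or wrongs < 0:
        raise ValueError("counts must be non-negative")
    return Fraction(rights) - Fraction(wrongs, wrong_divisor)


# Hypothetical five-item sheet: two right, one "Don't know", two wrong.
r, w = count_rights_and_wrongs(list("CCEAB"), list("CCCCC"))
print(r, w, formula_score(r, w))
```

Keeping the rights and wrongs counts separate, as the helper does, matches the statement that the two were scored separately for validation.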

Statistical  results. — The  data  that  follow  are  for  examinees  tested  at 
Psychological  Research  Unit  No.  1  in  the  period  June  to  August  1944. 

(1)  Distribution  statistics. — An  example  of  the  distribution  statistics
obtained  on  this  test  is  given  in  table  26.46.

Table  26.46. — Distribution  constants  for  Conduct  of  the  War,  CE520A,  for  a  sample
of  673  classified  pilots  in  primary  training


(2)  Item  validity. — After  dividing  a  sample  of  656  answer  sheets  into 
two  random  halves,  the  responses  to  the  items  were  correlated  with  the 
primary  pilot  graduation-elimination  criterion.  Results  are  shown  in 
Table  26.47.  Items  are  not  included  if  one  response  is  chosen  by  more
than  90  percent  or  less  than  10  percent  of  the  examinees. 
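
A minimal sketch of the item-validity computation follows, assuming that each phi is obtained from a fourfold table (response chosen or not chosen, against graduated or eliminated) by the standard formula. The function names and the sample counts are hypothetical.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi for a fourfold table.

    a = chose the response and graduated
    b = chose the response and was eliminated
    c = did not choose the response and graduated
    d = did not choose the response and was eliminated
    """
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        return 0.0  # a marginal total is zero, so phi is undefined; report 0
    return (a * d - b * c) / denom


def response_is_usable(n_choosing, n_total, low=0.10, high=0.90):
    """Apply the rule that drops responses chosen by more than 90 percent
    or fewer than 10 percent of the examinees."""
    proportion = n_choosing / n_total
    return low <= proportion <= high


# Hypothetical counts for one response in one random half.
print(phi_coefficient(30, 10, 10, 30))
print(response_is_usable(5, 100))
```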

Table  26.47. — Distribution  of  validity  phi  coefficients  for  Conduct  of  the  War  Test,
CE520A,  based  on  pilots  in  primary  training,  using  graduation-elimination
as  the  criterion

In  interpreting  these  phi  coefficients,  it  can  be  said  that  for  an  N  of
328  a  phi  of  0.11  is  significant  at  approximately  the  5  percent  level,  and
a  phi  of  0.15  is  significant  at  the  1  percent  level  of  confidence.  In  the  odds
sample,  37  phis  exceed  the  5  percent  level  of  significance  and  10  exceed
the  1  percent  level.  In  the  evens  sample,  the  corresponding  figures  are 
43  and  9. 
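
The significance thresholds quoted for phi can be approximated from the relation chi-square = N × phi², with one degree of freedom. The sketch below assumes that this is how the thresholds were derived, which the text does not say. The critical values 3.841 and 6.635 are the usual 5 percent and 1 percent points of chi-square for one degree of freedom.

```python
import math

# Chi-square critical values for 1 degree of freedom.
CHI2_CRITICAL_DF1 = {0.05: 3.841, 0.01: 6.635}


def critical_phi(n, alpha):
    """Smallest absolute phi that reaches significance for sample size n,
    using chi-square = n * phi**2 with one degree of freedom."""
    return math.sqrt(CHI2_CRITICAL_DF1[alpha] / n)


for n in (328, 370, 376):
    print(n, round(critical_phi(n, 0.05), 3), round(critical_phi(n, 0.01), 3))
```

For an N of 328 this gives about 0.11 at the 5 percent level and about 0.14 at the 1 percent level, so the 0.15 quoted in the text appears to be an approximate figure.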

(3)  Test  validity. — Validation  data  were  obtained  for  a  sample  of 
673  classified  pilots  in  primary  training.  The  results  are  shown  in  table 
26.48. 


Table  26.48. — Validity  of  Conduct  of  the  War,  CE520A,  based  on  graduation-
elimination  from  primary  pilot  training  (N_t = 673,  p_g = 0.68)

Score      M_g      M_e      SD_t     r        r (corrected) *

Rights     40.??    41.54    13.03    -0.05    -0.05
Wrongs     74.90    75.53    15.??    -0.04    -0.04

*  Assuming  an  unrestricted  stanine  standard  deviation  of  2.04


Evaluation. — The  hypothesis  that  men  who  are  motivated  by  a  patriotic
interest  and  have  therefore  obtained  more  knowledge  concerning  the
conduct  of  the  war  will  be  more  successful  in  air-crew  training  is  rejected.
This  conclusion  is  based  on  the  negative  correlations  between  the
scores  of  673  pilots  in  primary  training  and  the  graduation-elimination
criterion.  This  is  in  line  with  the  validities  of  two  achievement  tests:
Geography,  AS104  (X2),  and  History,  AS153  (X3),  both  of  which  were
slightly  negative.  Statistical  analysis  of  the  questions  in  the  test  reveals 
that  they  were  too  difficult,  even  for  persons  familiar  with  military  and 
political  events. 

Home  Front  Attitude  Inventory,  CE446A  *

The  purpose  of  this  test  is  to  measure  a  certain  aspect  of  predisposition 
to  combat  neurosis.  Lack  of  confidence  in  the  home  front,  according  to 
Grinker  and  Spiegel  (3),  was  frequently  observed  in  victims  of  mental
disorders  in  the  Tunisian  campaign.  Reports  from  combat  theaters  re¬ 
peatedly  cite  evidence  of  the  adverse  effect  on  military  morale  of  incidents 
at  home  that  lead  soldiers  to  believe  that  civilians  are  not  doing  their 
part  in  the  war.  This  test  is  an  attempt  to  explore  this  emotionally  signifi¬ 
cant  area  which  is  not  adequately  covered  by  any  tests  in  the  battery. 

Description. — The  test  consists  of  100  items,  each  of  which  is  a  two- 
or  three-sentence  description  of  the  behavior  of  a  fictitious  civilian  in
a  specific  situation.  The  examinee  is  asked  to  indicate  for  each  sample  of 
civilian  behavior  whether  it  is  "very  common,"  "fairly  common,"  "fairly 
uncommon,"  or  "very  uncommon."  He  is  also  asked  to  indicate  whether 
he  believes  that  the  civilian  conduct  is  "very  justifiable,"  "fairly  justifi¬ 
able,”  "fairly  unjustifiable,"  or  “very  unjustifiable."  Responses  to  the 
test  items  should  indicate  individual  differences  regarding  (1)  what  is 
felt  to  be  the  typical  wartime  behavior  of  civilians;  (2)  how  various 
wartime  behavior  tendencies  of  civilians  are  evaluated  by  aviation  students;
and  (3)  the  degree  of  confidence  which  is  held  regarding  the
support  of  the  war  effort  by  civilians.

*  Developed  at  Psychological  Research  Unit  No.  1.

(1)  Internal  characteristics. — Test  items  cover  such  areas  of  activity
as  the  economic  life  of  the  country,  individual  economic  interests,  sexual
needs  and  romantic  interest,  political  actions,  degree  of  civilian  sacrifices,
etc.  There  are  three  parts  to  the  test.  Part  I  has  37  items,  part  II,  37  items,
and  part  III,  26  items.  Three  typical  items  are:

Mr.  Z  was  the  leader  of  a  union  in  which  members  were  receiving  more  pay  than 
ever  before  in  their  lives.  There  were  some  causes  for  grievance  among  the  workers, 
but  he  smoothed  them  over  so  as  to  avoid  interfering  with  the  war  effort. 

Mrs.  B  had  always  been  a  little  afraid  of  the  sight  of  blood  and  was  somewhat 
timid  about  giving  to  the  Blood  Bank.  However,  when  she  was  reminded  that  her 
blood  might  save  a  soldier's  life,  she  agreed  to  donate. 

G  was  in  love  with  a  beautiful  girl  when  he  was  drafted.  The  letters  which  he 
received  from  her  continued  to  be  warm  and  affectionate  even  though  she  had 
fallen  in  love  with  someone  else.  She  felt  it  would  be  unpatriotic  to  "let  him  down 
while  he  was  out  there  fighting.” 

(2)  Administration. — The  following  are  excerpts  from  the  directions: 

This  is  a  test  of  your  judgment  about  events  which  are  taking  place  among
civilians  and/or  which  may  take  place  in  the  future. 

Each  paragraph  is  assigned  four  different  numbers  on  your  answer  sheet.  The 
corresponding  numbers  will  be  in  your  test  booklets  to  the  left  of  each  paragraph. 
Which  number  you  use  on  the  answer  sheet  will  depend  upon  your  judgment  of 
how  common  the  described  behavior  (thinking,  feeling,  talking,  or  acting)  is.  The 
first  of  the  numbers  will  be  used  if  you  think  it  is  very  common;  the  second  number
will  be  used  if  you  think  it  is  fairly  common;  the  third  number  will  be  used  if  you
think  it  is  fairly  uncommon;  and  the  fourth  number  will  be  used  if  you  think  it  is 
very  uncommon. 

After  you  have  decided  which  number  to  use,  you  will  have  to  decide  which  letter 
to  mark  after  the  number  you  have  chosen.  The  letter  you  mark  will  depend  on  your
judgment  of  how  justifiable  such  civilian  behavior  is.  You  will  mark  the  A  space
if  you  think  it  is  very  justifiable,  the  B  space  if  you  think  it  is  fairly  justifiable,  and 
the  C  space  if  you  think  it  is  fairly  unjustifiable,  and  the  D  space  if  you  think  it  is 
very  unjustifiable.  The  E  space  will  never  be  marked. 


(3)  Scoring. — The  original  system  was  such  that  each  of  the  16  alterna¬ 
tives  to  an  item  of  the  test  was  given  weight  in  the  scoring.  It  assumes 
that  all  items  are  equally  discriminating  and  that  a  high  positive  cor¬ 
relation  between  judgments  of  frequency  of  occurrence  and  judgments  of 
justifiability  is  indicative  of  optimum  morale.  Weights,  as  follows,  were 
assigned  to  responses  to  each  item. 

Item    A    B    C    D

1       8    6    4    0
2       6    5    3    4
3       4    3    5    6
4       0    4    6    8

Because  of  the  great  practical  difficulties  with  this  scoring  method,  it 
was  dropped. 
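
The weighting scheme can be sketched as a lookup table in Python. This is only an illustration of the abandoned method. It assumes that the rows numbered 1 to 4 stand for the four commonness judgments (very common through very uncommon) and the columns A to D for the four justifiability judgments; the function names are hypothetical.

```python
# Weights copied from the table above. Rows are taken to be the commonness
# judgment (1 = very common ... 4 = very uncommon), which is an assumption;
# columns are the justifiability letters.
WEIGHTS = {
    1: {"A": 8, "B": 6, "C": 4, "D": 0},
    2: {"A": 6, "B": 5, "C": 3, "D": 4},
    3: {"A": 4, "B": 3, "C": 5, "D": 6},
    4: {"A": 0, "B": 4, "C": 6, "D": 8},
}


def score_item(commonness, justifiability):
    """Weight for one double response, e.g. (1, "A")."""
    return WEIGHTS[commonness][justifiability]


def score_sheet(responses):
    """Sum the weights over all (commonness, justifiability) pairs."""
    return sum(score_item(c, j) for c, j in responses)


# Hypothetical three-item answer sheet.
print(score_sheet([(1, "A"), (4, "D"), (2, "D")]))
```

Under this reading the highest weight goes to responses in which the two judgments agree (common and justifiable, or uncommon and unjustifiable), which is consistent with the stated assumption about optimum morale.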

Statistical  results. — The  only  statistical  data  reported  are  item-validity 
studies,  based  upon  the  responses  of  examinees  tested  at  Psychological 
Research  Unit  No.  1  in  August  1944. 

(1)  Item  validity. — After  dividing  a  sample  of  740  answer  sheets  into 
two  random  halves,  the  responses  to  the  items  were  correlated  with  the 
primary  pilot  graduation-elimination  criterion.  With  this  size  of  sample,
it  is  feasible  to  obtain  item  validities  for  judgments  of  commonness  of
behavior  only,  since  only  a  small  proportion  of  examinees  selects  any 
one  of  the  16  responses  concerning  justifiability  of  behavior.  The  dis¬ 
tributions  of  the  phi  coefficients  are  shown  in  table  26.49.  Items  are  not 
included  if  one  response  is  chosen  by  more  than  90  percent  or  less  than 
10  percent  of  the  examinees. 


Table  26.49. — Distribution  of  validity  phi  coefficients  for  Home  Front  Attitudes,
CE446A,  based  on  pilots  in  primary  training,  using  graduation-elimination  as  the
criterion

In  interpreting  these  phi  coefficients,  it  can  be  said  that  for  an  N  of
370  a  phi  of  0.10  is  significant  at  approximately  the  5  percent  level,
and  a  phi  of  0.13  is  significant  at  the  1  percent  level  of  confidence.  In
the  odds  sample,  16  phis  exceed  the  5  percent  level  of  confidence  and
4  exceed  the  1  percent  level.  In  the  evens  sample,  the  corresponding 
figures  are  30  and  14.  For  no  item  did  the  two  keys  agree. 

Evaluation. — Since  little  empirical  evidence  is  available,  there  are  few 
conclusions  that  can  be  stated  about  this  test.  What  little  evidence  does 
exist,  does  not  support  the  original  hypothesis.  The  novel  feature  of 
this  instrument— the  double  response  and  scoring  based  on  agreement 
of  two  responses — could  be  suggestive  of  other  uses  to  which  these 
devices  might  be  adapted.  This  scoring  system,  however,  is  impractical,  if 
not  prohibitive,  for  use  with  the  machine-scoring  of  large  numbers  of 
answer  sheets. 

EVALUATION  OF  MOTIVATION  MEASURES 

Research  in  the  field  of  motivation  has  verified  previous  results  and 
has  revealed  much  that  was  formerly  unknown.  The  two  methods  of  ap¬ 
proach  utilized  in  air-crew  classification,  one  subjective  and  the  other 
objective,  provide  an  interesting  contrast,  both  of  techniques  and  of 
results. 

Analysis  of  the  voluminous  data  obtained  for  the  Preference  Blank 
shows  conclusively  that  expressed  preferences  and  strength  of  interest 
for  pilot  or  bombardier  training  do  not  correlate  appreciably  with  either 
the  pilot  or  bombardier  criteria  or  with  the  classification  tests  that  have 
high  correlations  with  the  criteria.  On  the  other  hand,  preferences  and 
strength  of  interest  for  navigator  training  do  correlate  significantly  and 
substantially  with  the  criterion,  and,  in  addition,  with  the  navigator
stanine,  and  with  individual  tests  that  correlate  well  with  the  navigator
criterion.  This  may  mean  that  those  interested  in  navigation  also  have 
insight  into  their  abilities  and  temperament  and  appreciate  the  demands 
of  the  job  they  desire  to  perform. 

More  objective  instruments,  such  as  the  Satisfaction  test  and  the  two 
aviation-interest  tests  that  were  introduced  into  the  classification  battery — 
General  Information  and  Biographical  Data — have  higher  correlations 
with  the  pilot  criterion.  This  substantiates  the  conclusion  that  objective 
techniques  are  needed  for  an  effective  assessment  of  an  examinee's 
interests. 

While  not  enough  data  are  available  to  make  a  definite  generalization 
on  the  point,  the  available  results  from  the  attitude  and  interests  in¬ 
ventories  suggest  that  the  factors  that  primarily  account  for  fighter-pilot 
success  are  also  factors  that  were  found  to  account  for  success  in  primary 
and  basic  training.  These  factors  are:  experiences  in  mechanics  and 
active  sports,  general  tendencies  toward  recklessness,  adventurousness, 
extravagance,  and  a  devil-may-care  attitude.  It  is  probable  that  the 
bomber  pilot  requires  some  of  these  traits;  but  that,  in  addition,  he  must 
possess  more  special  social  intelligence  (leadership),  and  be  more  con¬ 
scientious,  methodical,  thorough,  cautious,  and  persevering.  There  are  also 
data  indicating  that  pilots  assigned  to  heavy-bomber  training  have  more 
interest  in  navigator  training  than  do  fighter  pilots. 

Traits  measured  by  the  combat-readiness  tests  are  expected  to  be  more 
important  in  combat  than  in  training.  Since  combat  criteria  are  difficult 
to  obtain,  proper  validation  of  these  tests  has  not  been  effected.  There¬ 
fore,  it  has  not  been  possible  to  ascertain  their  usefulness. 

CHAPTER  TWENTY-SEVEN


Biographical  Data  ¹


INTRODUCTION 

Research  by  Civil  Aeronautics  Administration 

The  decision  of  Army  Air  Forces  psychologists  to  construct  a  bio¬ 
graphical  information  blank  for  air-crew  candidates  was  influenced  by 
studies  in  this  area  sponsored  by  the  Civil  Aeronautics  Administration 
(2,  3).  Prior  to  Pearl  Harbor,  research  was  conducted  at  Tulane  University
and  the  University  of  North  Carolina — with  data  gathered  at  the
Naval  Air  Station,  Pensacola,  Fla. — under  the  direction  of  Dr.  H.  M.
Johnson  (2).  Statements  of  biographical  information  and  of  interests
obtained  through  personal  interviews  with  480  students  at  the  Naval  Air
Station  and  their  responses  to  the  125  items  of  the  Bernreuter  Personality
test  were  analyzed.  The  answers  to  the  Bernreuter  questionnaire  were
regarded  as  biographical  information  in  the  sense  that  they  purported
to  indicate  some  of  the  individual's  likes  and  dislikes,  habits  of  social
adjustment,  interests,  etc.

An  item  validation  of  the  Bernreuter  questionnaire  revealed  nine
potentially  useful  items.  When  weighted  in  the  best  possible  manner, 
these  items  yielded  a  multiple  correlation  of  0.35  (unshrunken)  with  the 
pass-fail  criterion  for  pilot  training.  The  nine  items  were  combined  with 
other  items  of  biographical  information  gathered  in  the  personal  inter¬ 
view.  The  latter  were  more  objective,  concerning  facts  about  education, 
religion,  occupations,  and  athletics.  The  coefficient  of  multiple  correla¬ 
tion  between  the  15  most  predictive  biographical-information  items 
(including  the  seven  most  predictive  Bernreuter  items),  optimally
weighted,  and  the  pass-fail  pilot  criterion  for  a  sample  of  approximately 
280  pilots  was  0.51.  In  obtaining  this  figure,  shrinkage  was  estimated, 
but  the  items  had  not  been  administered  to  a  new  sample.  Experience  in 
the  Army  Air  Forces  has  shown  the  utmost  importance  of  the  validation 
of  any  weighted  composite  score  on  an  entirely  new  sample  of  adequate 
size. 

Studies  in  Industry 

In  addition  to  the  promising  results  of  these  early  CAA  studies,  there 
had  been  experimentation  in  the  industrial  field  which  tended  to  confirm 
belief  in  the  value  of  biographical  information  for  predicting  success  in  a 
particular  area  of  activity.  For  example,  studies  involving  life-insurance
company  employees  revealed  that  certain  items  of  information  from  personal
and  occupational  histories  were  predictive  of  success  or  failure  in
specific  life-insurance  jobs.  This  conclusion  was  reached  through  an  empirical
evaluation  of  application  forms  and  of  interview  data.

¹  Written  by  Tech/Sgt.  Sanford  J.  Stock.

An  Extreme  View  of  the  Biographical  Approach 

An  extreme  point  of  view  expressed  by  Guthrie  (1)  can  be  introduced
here  to  support  the  biographical-data  type  of  approach,  without
necessarily  subscribing  to  his  theory  of  personality  or  his  convictions 
concerning  the  usefulness  of  specific  versus  general  information  as  a 
basis  for  prediction.  He  asserts  that:

The  useful  categories  in  describing  what  may  be  expected  of  a  man  include  his 
skills  which  can  be  often  readily  measured.  They  include  his  types  of  adjustment 
described  in  terms  of  the  situations  to  which  he  has  been  exposed.  He  is  a  hardened 
campaigner,  an  experienced  broker,  an  experienced  carpenter;  he  has  been  for  10 
years  a  head  waiter.  If  we  know  these  types  of  experience,  we  may  safely  assume 
that  he  has  learned  skills  that  meet  the  problems  of  these  trades  and  occupations. 

In  speaking  of  application  blanks,  the  same  author  points  out  that 

*  *  *  we  may  require  the  statements  of  sponsors  as  to  his  (applicant’s)  introversion
or  extraversion,  his  general  honesty,  his  loyalty,  his  industry.  But  the  useful
information  on  this  blank  is  more  likely  to  be  his  past  record  of  occupation,  his
specific  skills,  his  financial  status,  his  marital  and  police  record.  His  past  affiliations,
political  and  religious,  offer  better  and  more  specific  predictions  of  his  future  than
any  of  the  traits  that  we  usually  think  of  as  personality  traits  (1,  p.  66).

Job  Analysis  Data

An  analysis  of  the  jobs  performed  by  pilot,  bombardier  and  navigator 
encourages  the  hypothesis  that  persons  with  certain  types  and  combina¬ 
tions  of  educational,  physical,  social,  recreational,  and  occupational  his¬ 
tories  would  be  likely  or  unlikely  to  possess  the  necessary  characteristics 
for  air-crew  success.

The  task  of  the  pilot. — As  an  example,  consider  one  aspect  of  the  pilot’s 
job — manipulation  of  controls.  The  flight  controls  include  the  stick, 
rudder  control,  flap  control,  throttle,  trimming  controls,  propeller  pitch, 
and  brakes.  Operation  of  these  controls  requires  coordinated  movements 
of  arms  or  legs  or  both.  Changing  sensory  stimuli  must  be  followed  by 
proper  motor  responses  coordinated  with  them.  We  know  that  although 
both  large  and  small  muscle  groups  are  used,  the  large  muscle  groups 
are  here  most  important.  We  also  know  that  the  movements  are  varied,
depending  on  the  immediate  situation;  that  they  may  be  planned  and
deliberate  but  at  times  must  be  prompt  and  automatic.

It  is  quite  reasonable  to  suppose  that  individuals  who  have  successfully 
engaged  in  activities  involving  similar  motor  patterns  will  have  superior 
aptitude  for  piloting.  Even  with  an  incomplete  analysis  of  the  pilot’s 
motor  functions,  we  are  given  sufficient  leads  for  framing  questions,
asking  accordingly  for  athletic  and  occupational  experience.  For  example,
the  man  who  is  proficient  in  tennis,  basketball,  motorcycling,  or  ice
hockey  might  well  be  expected  to  have  certain  qualities  of  motor  co¬ 
ordination  and  control  which  may  assist  him  substantially  in  learning  to 
be  a  pilot.  The  same  can  be  said  about  the  individual  who,  in  civilian 
life,  was  an  acrobat  or  operator  of  a  complex  machine,  such  as  a  crane. 

The  tasks  of  bombardier  and  navigator. — The  bombardier  and  navi¬ 
gator,  in  contrast  to  the  pilot,  use  small-muscle  groups  predominantly. 
The  bombardier  in  operating  the  bombsight  and  releasing  bombs  must 
make  precise  movements  involving  fine  adjustments.  Precise  eye-hand 
coordination  is  required.  In  using  many  of  the  navigational  instruments, 
eye-hand  or  eye-finger  coordination  is  required.  For  example,  in  a
cramped  position,  and  holding  the  sextant  with  the  left  hand,  the  navigator
must  center  a  comparatively  sensitive  bubble  in  its  chamber  and,
using  his  right-hand  fingers  on  the  adjustment  knob,  bring  the  star  under
observation  into  the  bubble.  Furthermore,  he  must  clock  his  recorder
at  the  instant  the  bubble  is  centered  and  the  star  is  in  the  center  of  the
bubble.  This  description  augurs  a  history  of  activities  demanding  motor
coordination  different  from  that  of  the  potentially  successful  pilot.  Again,
it  would  seem  reasonable  that  an  individual  proficient  in  motor  patterns 
similar  to  those  of  navigator  and  bombardier  would  have  a  superior 
aptitude  for  these  positions. 

General  experience  and  education. — We  can  generalize  this  conception.
Possibly,  it  would  be  fruitful  to  delve  into  every  phase  of  a  candidate's 
biography  that  seems  to  have  a  connection,  direct  or  indirect,  with  the 
characteristic  habits  required  for  success  in  the  various  air-crew  jobs. 
On  this  basis,  for  example,  questions  were  inserted  in  the  biographical 
data  blank  which  sought  to  reveal  the  examinee's  mechanical  interest  and 
experience,  because  it  was  known  that  the  pilot  must  have  an  under¬ 
standing  of  mechanical  principles  and  must  be  able  to  work  with  mechani¬ 
cal  devices.  Questions  were  asked  about  mathematical  knowledge,  extent 
of  education,  and  reading  habits,  because  it  was  known  that  the  navigator 
had  to  have  a  high  degree  of  numerical  proficiency;  and  it  was  assumed 
that  he  was,  on  the  whole,  more  intellectually  inclined  than  the  pilot.
These  examples  illustrate  the  approach  that  was  based  on  the  hypothesis 
that  specific  aptitudes  can  be  inferred  from  a  knowledge  of  experience 
and  background. 

Summary 

The  above  discussion  implies  the  premise  that  the  habits,  motor  or 
intellectual,  that  an  individual  has  learned  in  the  past  will,  by  transfer, 
aid  him  in  learning  air-crew  duties.  This  is  no  doubt  true.  But  it  does  not
preclude  another  premise,  that  what  he  has  done  before  successfully
indicates  constitutional  aptitudes  for  learning  those  habits,  and  interest 
in  activities,  vocational  or  avocational,  which  yielded  him  satisfactions 
because  he  was  ready  to  do  well  in  them. 

To  summarize,  it  can  be  said  that  aviation  psychologists  began  construction
of  a  biographical  information  blank  because  of:  (1)  previous  studies
in  this  area  sponsored  by  the  Civil  Aeronautics  Administration,  (2)  experience
in  industry,  (3)  job  analysis,  and  (4)  armchair  reasoning.

Biographical  Data  Blank,  CE602A  * 

This  is  the  first  form  of  the  Biographical  Data  Blank. 

Description. — The  items  relate  to  individual  interests,  attitudes,  and 
background.  The  first  group  of  20  items  asks  for  information  about  the 
examinee’s  home  and  personal  history.  These  questions  are  designed  to 
reveal  such  facts  as  father’s  education  and  occupation,  parents’  national 
stock  and  marital  and  financial  status,  the  population  size  and  general 
location  of  the  area  where  the  examinee  lived,  the  examinee’s  religion, 
the  extent  and  emphasis  of  his  education.  The  next  10  questions  ask  the 
examinee  to  rate,  according  to  a  graduated  scale,  his  interest  in  various 
subjects  studied  in  school.  Questions  31  to  61  ask  him  to  indicate  the 
extent  of  his  interest  in  certain  activities  such  as  art,  literature,  music, 
radio,  science,  mathematics,  hiking,  smoking,  etc.  The  next  12  items
ask  for  the  degree  of  proficiency  he  possesses  in  sports.  Then  nine  items
elicit  information  about  his  previous  employment  and  occupation  and  his
occupational  preference.  Questions  86  to  95  concern  his  military  experience,
including  previous  Air  Corps  jobs.  Questions  96  to  116  ask  for
expressions  of  preference  for  military  and  civilian  air-crew  jobs.  The
last  group,  117  to  150,  consists  of  positive  statements  about  controversial
matters.  The  examinee  is  asked  to  indicate  his  degree  of  agreement  with
the  statement.  Typical  statements  are:  "Skill  is  more  important  than
judgment  in  flying;"  "Discipline  should  be  as  strict  in  the  Air  Corps
as  in  other  branches  of  the  army;"  "A  pilot  who  has  had  more  than  one
drink  should  not  fly  a  plane."

(1)  Internal  characteristics. — Biographical  Data,  CE602A,  contains 
150  items  divided  into  the  8  sections  described  below. 

(2)  Administration. — All  items  and  directions  are  included  in  the  test
booklet.  The  directions  attempt  at  the  outset  to  change  the  set  of  the  ex¬ 
aminee  who  is  expecting  a  typical  test.  An  explanation  is  made  of  why 
information  about  the  student’s  background  is  important.  It  is  the  func¬ 
tion  of  the  directions  to  establish  rapport  so  that  honest,  straightforward, 
and  cooperative  answers  will  be  given.  All  examinees  are  allowed  to 
finish  the  blank,  which  requires  approximately  45  minutes. 

Following  are  the  pertinent  parts  of  the  directions  and  an  illustrative 
item  from  each  section:

In  this  blank  you  are  asked  for  certain  information  about  your  background,  your 
family,  your  home,  your  education,  your  interests,  and  your  attitudes.  This  is  not  a 
test.  There  are  no  right  answers  except  the  answers  that  tell  the  truth  about
yourself.

*  Developed  at  Psychological  Research  Unit  No.  1.  Chief  contributor  is  Capt.  Laurance  F.
Shaffer.

To  a  large  extent,  your  future  success  as  a  pilot  will  depend  on  how  well  you  are
understood  by  those  in  charge  of  your  flight  training.  All  of  the  information  asked 
for  has  been  shown  to  be  related  to  the  proper  training  of  pilots.  It  is  therefore  to 
your  own  interest  to  fill  out  this  blank  carefully  and  completely.  In  no  case  should 
any  part  be  omitted  *  *  *  Read  each  question  and  the  answers  that  follow  it.
Among  the  several  suggested  answers,  select  the  one  that  best  answers  the  question
for  you  *  *  *  Work  as  rapidly  as  you  can  without  being  careless.  Don't  think
a  long  time  about  questions  that  ask  for  your  interest  and  opinions,  but  record  your
first  impressions  rapidly.  Everyone  should  finish  the  blank,  and  answer  all  of  the
questions  *  *  *

Section  I 

Your  father  and  his  parents  were  chiefly  of : 

A.  American  stock. 

B.  Northern  European  stock.  (English,  Irish,  Scandinavian,  German,  etc.)

C.  Southern  European  stock.  (French,  Italian,  Spanish,  etc.)

D.  Slavic  stock.  (Russian,  Polish,  Greek,  etc.)

E.  Other.

Section  II

Directions:  Indicate  how  well  you  like  each  school  subject  listed  below  by  blackening
the  spaces  as  follows:

A.  Liked  the  subject  exceptionally  well.

B.  Liked  the  subject  somewhat.

C.  Indifferent — did  not  care.

D.  Disliked  the  subject.

E.  Never  studied  the  subject.

21.  Mathematics. 

22.  Sciences. 

23.  History. 

24.  English;  literature.

Section  III

Rank  the  following  five  from  A  (liked  best)  to  E  (liked  least):

54.  Writing  a  technical  report.

55.  Riding  horseback.

56.  Developing  a  business  system. 

57.  Repairing  a  radio. 

58.  Soliciting  contributions  for  charity. 

Section  IV 

Directions:  For  each  activity  listed  below,  you  are  to  blacken  a  space  to  indicate
how  well  you  perform  that  activity  according  to  the  following  scale:

A.  Exceptionally  well.

B.  Well.

C.  Fairly  well.

D.  Poorly.

E.  Do  not  engage  in  this  activity.

Thus  if  you  play  football  fairly  well,  blacken  the  space  under  C  in  Row  62;  if
you  do  not  play  football  at  all,  blacken  under  E.

62.  Football,  rugby,  or  soccer.

63.  Basketball,  hockey,  or  lacrosse.

64.  Baseball  or  softball.

65.  Boxing,  wrestling,  or  water  polo. 

Section  V

Directions:  Mark  any  of  the  following  types  of  work  which  you  have  done  at  any
time,  and  for  which  you  have  received  remuneration.  (More  than  one  may  be
marked.)

79-A.  Manufacturing  industries  (machine  operator,  factory  hand,  textile
worker,  etc.).

79-B.  Technical  trades  (baker,  electrician,  radio  repairman,  etc.).

79-C.  Transportation  and  communication  (truck  driver,  linesman,  deckhand,
etc.).

79-D.  Business  trades  (store  clerk,  salesman,  agent,  window  dresser,  etc.).

79-E.  Public  service  (fireman,  policeman,  forest  ranger,  soldier,  etc.).

Section  VI

How  much  flying  experience  have  you  had:

91-A.  Never  been  in  the  air.

91-B.  Have  flown  some  in  transport  ships.

91-C.  Fly  with  friends  occasionally.

91-D.  Have  had  some  instruction.

91-E.  Have  soloed.

Section  VII

Directions:  Below  are  listed  five  types  of  military  piloting.  Choose  the  one  of  the
five  types  that  you  would  most  like  to  do,  and  blacken  the  space  under  A  for  its
number.  Now  decide  which  you  would  like  next  best,  and  blacken  the  space  under
B  at  the  right  of  its  number.  In  the  same  manner,  indicate  your  third,  fourth,  and
last  choices  by  blackening  under  C,  D,  and  E.  Do  not  give  any  two  of  the  five  the
same  rank.

96.  Scout  observation  planes.

97.  Light  bombers.

98.  Pursuit  planes.

99.  Pilot  in  command  of  large  bomber.

100.  Test  pilot.

Section  VIII

Directions:  Below  are  a  number  of  statements  about  which  there  are  wide  differences
of  opinion.  Indicate  your  personal  opinions,  whether  they  agree  with  those
of  others  or  not.  Tell  how  you  feel  about  the  statement  by  blackening  one  of  the
spaces  as  before  according  to  this  scale:

A.  Strongly  agree.

B.  Agree.

C.  Undecided.

D.  Disagree.

E.  Strongly  disagree.

117.  Almost  any  normal  young  man  can  learn  to  fly.

118.  Most  airplane  accidents  could  undoubtedly  be  prevented  if  certain  inherent
structural  defects  of  planes  were  removed.

Statistical  results. — Validation  data  are  available  for  cases  tested  in
January  1942  at  Psychological  Research  Unit  No.  1. 

(1)  Test  validity. — The  answer  sheets  of  574  graduates  and  304
eliminees  from  primary  pilot  training  (classes  42H,  42I,  and  42J)  were
divided  into  two  equal  groups.  A  key  was  prepared  on  the  basis  of  each
group,  weighting  items  that  differentiated  (simple  difference  in  percentages)
at  the  1  percent  level  of  significance.  Cases  with  previous  flying
experience  were  omitted.  The  key  derived  from  each  group  was  validated
against  primary  graduation-elimination  in  the  other  group.  The  validity
coefficients  (biserials)  were  0.43  and  0.36.

(2)  Item  validity. — The  validity  of  responses  in  this  test  is  indicated 
by  a  range  of  phis  from  —0.24  to  +0.30,  with  a  standard  deviation  of 
0.06.  A  significant  negative  phi,  of  course,  is  just  as  important  as  a 
positive  one  of  comparable  size.  The  data  are  based  upon  the  responses 
of the above-mentioned sample of 574 graduates and 304 eliminees from
primary  pilot  training. 

Evaluation. — Validity  statistics  computed  for  Biographical  Data, 
CE602A,  indicated  that  the  hypothesis  that  aptitude  for  flying  would  be 
predicted  by  scoring  certain  biographical  information  was  justified.  The 
extent  of  the  validity  that  could  be  expected  consistently  was  not  yet 
definitely  ascertained,  but  the  approach  was  most  promising.  The  next 
step  was  to  revise  and  to  construct  new  items  on  the  basis  of  the  item 
analysis  of  CE602A. 

Biographical  Data  Blank,  CE602B  * 

This  is  the  second  form  of  the  biographical  data  blank. 

Description. — There  is  a  considerable  difference  in  the  composition  of 
forms CE602B and CE602A. Form A contains a mixture of items, including
biographical questions of fact, and also questions involving interests
and preferences. Form B contains only biographical questions of fact.
The  emphasis  is  on  “What  have  you  done?”  rather  than,  “What  do  you 
like?”  There  were  misgivings  concerning  the  use  of  subjective  judgments 
as  in  parts  of  CE602A,  in  which  intentional  or  unintentional  misrepresen¬ 
tation  is  possible.  It  was  not  claimed  that  the  strictly  biographical  type  of 
question  is  completely  free  from  the  possibility  of  falsification,  but  it  was 
felt  that  factual  statements  are  less  likely  to  be  falsified  than  statements 
of  opinion.  They  were  also  believed  to  be  more  answerable,  in  that  an 
individual  knows  what  events  or  conditions  occurred  within  his  experi¬ 
ence,  whereas  he  has  not  formed  definite  opinions  and  attitudes  in  all  areas 
of  life  or  has  not  verbalized  them  sufficiently  to  report  them  in  a  precise 
manner. 

(1) Internal characteristics.—This form consists of 125 items. The
categories covered include origin and personal history of parents, early
home environment of examinee, attitude of parents toward examinee's
Air Corps career, subjects studied in school and proficiency attained,
hobbies, proficiency in athletics, participation in mechanical and literary
activities, occupational experience, and military experience.

(2)  Administration. — This  form  is  administered  in  the  same  manner 
as  CE602A.  The  total  time  required  is  approximately  40  minutes. 

(3) Scoring.—In April 1942, form CE602B was administered experimentally
to a large number of students along with the classification battery
at Psychological Research Unit No. 3, Santa Ana. Later, 1,882 of
these examinees graduated from primary pilot training, while 735 were
eliminated (classes 42I, 42J, 42K, and 43A). An item-validation study
was made and a pilot key constructed on the basis of correlations of
responses with the pass-fail criterion. This key covered 51 responses,
22 of which were weighted plus 1 and 29 minus 1, the algebraic sign
being consistent with that of the phi correlation.

The same blank, CE602B, was administered with the classification battery
to 3,042 students at Psychological Research Unit No. 1, Maxwell
Field, during the period Feb. 12 through Apr. 2, 1942. Of these, 598 who
entered  pilot  training  were  used  in  validation  studies.  The  biographical 
data  blank  was  not  referred  to  as  an  experimental  test,  nor  treated  in  any 
other  way  that  would  distinguish  it  from  the  classification  tests  either  at 
Santa  Ana  or  at  Maxwell  Field.  The  validation  of  the  blank  with  these 
samples,  therefore,  represents  results  that  might  be  expected  if  the  blank 
were  administered  subsequently  for  classification  purposes. 

The key furnished by Psychological Research Unit No. 3 yielded a
biserial of 0.34 for the 598 pilot students tested by Psychological Research
Unit No. 1. (See table 27.1.) The Santa Ana key was then augmented
by including all items (except those dealing directly with actual flying
experiences) that showed a difference of 5.6 between the percentage of
responses for the graduates and eliminees based on the Psychological
Research Unit No. 3 tabulations. For N's of the magnitude involved in
this study, a difference of 5.6 was significant at the 1 percent level for a
response chosen by half the group. For items chosen by any other proportion
of the group, a difference of 5.6 had a higher level of significance.
On this basis 21 responses were added to the key, and 3 were deleted,
making a total of 69 responses, 37 weighted plus 1 and 32 minus 1.
The augmented key yielded a biserial r of 0.42 for the same group of
598 pilot students. (See table 27.1.)
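
The 5.6-point rule for augmenting the key can be checked with a short sketch. It assumes the ordinary critical ratio for a difference between two independent percentages, with a pooled-proportion standard error; the report does not state which formula was used, and the function names and the 0.20 proportion are illustrative only. The sample sizes are the 1,882 graduates and 735 eliminees of the Santa Ana study.

```python
import math
from statistics import NormalDist


def critical_ratio(diff_points, p_pooled, n_grad, n_elim):
    """Critical ratio (z) for a difference between two independent percentages.

    Assumes the pooled-proportion standard error; the report does not
    state which formula its authors used.
    """
    q_pooled = 1.0 - p_pooled
    se = math.sqrt(p_pooled * q_pooled * (1.0 / n_grad + 1.0 / n_elim))
    return (diff_points / 100.0) / se


def two_tailed_p(z):
    """Two-tailed normal-curve probability for a critical ratio."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Sample sizes from the Santa Ana item-validation study.
N_GRADUATES, N_ELIMINEES = 1882, 735

# A response chosen by half the group, and (illustratively) by one-fifth.
z_half = critical_ratio(5.6, 0.50, N_GRADUATES, N_ELIMINEES)
z_fifth = critical_ratio(5.6, 0.20, N_GRADUATES, N_ELIMINEES)

print(f"chosen by half the group: z = {z_half:.2f}, p = {two_tailed_p(z_half):.3f}")
print(f"chosen by one-fifth:      z = {z_fifth:.2f}, p = {two_tailed_p(z_fifth):.4f}")
```

Under these assumptions the ratio for a response chosen by half the group falls near 2.58, the conventional 1 percent point, and it rises as the proportion departs from one-half, which agrees with the statement in the text.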

Statistical results. (1) Distribution statistics.—The distribution of
scores in this test is indicated by a mean score of -2.0 and a standard
deviation of 5.2. The scoring formula is R - W, in which R and W
stand for the positively and negatively weighted responses, respectively.
These data are based on the above-mentioned sample of 598 classified
pilots scored with the Psychological Research Unit No. 3 key, which
contained 22 positive and 29 negative responses. The same 598 classified
pilots were scored with the Psychological Research Unit No. 1 augmented
key, which contained 37 positive and 32 negative responses. The
mean score was 3.7 and the standard deviation 7.0.

(2)  Test  validity. — The  validity  data  are  summarized  in  table  27.1. 
These  include  results  from  two  other  keys  developed  at  Psychological 
Research Unit No. 1 and at Headquarters, Army Air Forces, based on
all  available  data. 
Table 27.1.—Validity of Biographical Data, CE602B, for primary pilot training,
graduation-elimination criterion

Key                                      N        Proportion    Biserial r
                                                  graduating
PRU No. 3 (51 scored responses) ....     598¹     0.70          0.34
PRU No. 3 (69 scored responses) ....     598¹      .70           .42
PRU No. 1 (47 scored responses) ....     900²      .62           .22
AFTAS³ (102 scored responses) ......     900²      .62           .20

[The columns of means and standard deviations are largely illegible in the
source. Surviving entries: -2.94 (first key), 0.46 (second key), 2.00 (third
key), 4.46 (fourth key).]

¹ Same sample, tested at Psychological Research Unit No. 1 from Feb. 12 to Apr. 2, 1942.

² Same sample, tested at Psychological Research Unit No. 1 in September 1942. In class 43F.

³ The Air Surgeon's Office, Headquarters, Army Air Forces.

(3) Item validity.—The data are summarized in table 27.2.

Table 27.2.—Item validity data for Biographical Data, CE602B, based on 1,882
graduates and 735 eliminees from primary pilot training¹

Group of items      Mφ      SDφ      Range of φ
                                     Low      High

[Row labels are not recoverable from the source. Cell values as extracted, in
order: 0.09, -.11, 0.02, 0.02, 0.12, .02, -.10, -.01.]

¹ In classes 42I, 42J, 42K, and 43A. Tested at Psychological Research Unit
No. 3 in April 1942.

Evaluation. — The  predictive  value  of  the  biographical-information 
items  was  sustained  in  the  second  form  of  the  Blank,  CE602B.  An  aug¬ 
mented  key  based  on  69  responses  yielded  a  pilot  validity  of  0.42.  In  the 
preparation of this form, all items of opinion were eliminated, leaving
questions  of  fact  only.  This  apparently  did  not  alter  the  test  validity 
significantly. 

Variations of the test.—Two variations of this form are worthy of
mention: CE602B-SA and CE602SAB.

(1) Form CE602B-SA.*—This form is a version of CE602B prepared
especially for experimental administration in January 1943 to pilots.
It contains 115 items, including 20 questions (part II) asking for personal
opinions, similar to the final section in the first Blank, CE602A. The scoring
formula is R - W + 20. It was administered at Psychological Research
Unit No. 3 in an experimental battery just before the students left preflight
school for primary school.

Statistical results. (1) Distribution statistics.—In table 27.3 are presented
distribution constants, using three keys.

Table 27.3.—Distribution data for samples of pilots scored with different keys on
Biographical Data, CE602B-SA

Part        Key                N         M        SD
I .......   Original           ...       20.2     ...
II ......   Do                 ...       20.6     5.1
Total ...   Stanine key¹       1,645     19.2     2.2
Do ......   Validity key¹      1,617     19.6     4.0

[Entries shown as ... are illegible in the source.]

¹ In the stanine key, items were weighted to maximize correlation with stanine,
with minimal correlation with the graduation-elimination criterion. The validity
key was constructed to accomplish the reverse function. See also [reference
illegible].

Samples were in classes 43H and 43J.

* Developed at Psychological Research Unit No. 3. Chief contributors:
Lt. Col. J. P. Guilford, Capt. L. G. Humphreys, and Maj. Merrill F. Roff.

* Chief contributors: Capt. S. W. Cook,
(2) Test validity.—Validation results are summarized in table 27.4.

Table 27.4.—Validity data for Biographical Data Blank, CE602B-SA, for pilot
training, graduation-elimination criterion

Group                      Part or key    N       Prop.   Mean,   Mean,   SD     r bis   r bis,
                                                  grad.   grad.   elim.                  corrected¹
In primary training ....   I              856     0.75    21.75   16.20   7.75   0.42    0.45
Do .....................   II             856      .75    20.92   19.84   5.10    .13     .13
Through basic training     I              856      .67    22.05   16.95   7.75    .40     .43
Do .....................   II             856      .67    21.04   19.86   5.10    .14     .15
In primary training ....   Stanine key    1,645    .62    19.31   19.05   3.17    .05    ....
Do .....................   Validity key   1,617    .62    20.62   17.86   4.64    .37    ....

¹ Assuming an unrestricted stanine standard deviation of 2.00.

Samples were in classes 43J and 43H.
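
The biserials in table 27.4 can be checked against the tabled means, as a minimal sketch. It assumes the conventional biserial formula, r = (M_g - M_e) / SD × pq / y, where y is the normal-curve ordinate at the point cutting off the proportion p of graduates; the report does not print its formula, and the function and variable names here are illustrative. The inputs are the part I and part II figures for pilots in primary training (proportion graduating 0.75).

```python
from statistics import NormalDist


def biserial_r(mean_grad, mean_elim, sd_total, p_grad):
    """Biserial correlation from group means (conventional formula, assumed)."""
    q_elim = 1.0 - p_grad
    z_cut = NormalDist().inv_cdf(p_grad)   # point dividing graduates from eliminees
    y = NormalDist().pdf(z_cut)            # normal-curve ordinate at that point
    return (mean_grad - mean_elim) / sd_total * (p_grad * q_elim) / y


# Part I (factual items) and part II (opinion items), in primary training.
r_part1 = biserial_r(21.75, 16.20, 7.75, 0.75)
r_part2 = biserial_r(20.92, 19.84, 5.10, 0.75)

print(f"part I:  r bis = {r_part1:.3f}")
print(f"part II: r bis = {r_part2:.3f}")
```

On these assumptions part I reproduces the tabled 0.42, and part II comes out close to the tabled 0.13; small discrepancies are to be expected because the tabled means and proportions are rounded.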

(3) Item validity.—Item validity data for 2,500 pilots in primary
training are shown in table 27.5. The proportion of this sample that
graduated is unknown.

Table 27.5.—Item validity data based on 2,500 pilots in primary training¹

Group of responses         Mφ       SDφ      Range of φ
                                             Low       High
Responses keyed +1 ....    0.04     0.04     -0.06     +0.19
Responses keyed -1 ....    -.05      .06      -.18      +.13

¹ In class 43H.


Evaluation. — Form  CE602B-SA  revealed  the  relative  lack  of  predic¬ 
tiveness  of  the  opinion  items,  as  indicated  by  biserials  of  0.13  and  0.15, 
as  compared  with  0.45  and  0.43  for  the  factual  items.  Tests  with  as  low 
validity  as  for  the  opinion  items  often  add  something  to  prediction,  how¬ 
ever,  particularly  if  their  contributions  are  unique.  Since  this  seemed  a 
distinct  possibility  in  the  case  of  these  items,  further  attention  was  given 
to  them  in  an  enlarged  test  known  as  Survey  of  Aviator  Opinion, 
CE604A  (see  ch.  25). 

(2) Form CE602SAB.*—This form contains 22 items, divided into two
parts. Part I (Biographical) consists of 12 items, and part II (Opinions
and Interests), of 10. All examinees were allowed to finish, which required
approximately [figure illegible] minutes. Most of the items were based on ideas
presumed to be valid by nonpsychological personnel in the classification
section of the Santa Ana Army Air Base Classification Center. These observers
believed that certain answers to the included questions would
indicate a favorable background for flying success. The questions were
concerned primarily with military experience, before and after entering
the Air Forces. Following is a typical question:

How  much  military  experience  had  you  had  before  entering  the  Air  Forces? 

A.  Member  of  an  organization  that  has  been  overseas  under  fire. 

B. Member of an organization that has been overseas but not under fire.

C. Member of an organization that had been alerted for overseas duty.

D. Was in training or on duty in the United States only.

E. Came directly into Air Forces after induction.

* Developed at Psychological Research Unit No. 3. Chief contributor: Lt. David H. Jenkins.

Part II asked questions of attitude and interest, but this part differed
from the interest sections in the previous blanks in that it was directed
almost wholly towards aviation and military interest.

To what extent do you like to work with motors and engines?

A. Prefer it to most other things.

B. Like it as much as most other things.

C. Like it less than most other things.

D. Do not like it at all.

Statistical  results. — Biographical  Data,  CE602SAB,  was  administered 
at  Psychological  Research  Unit  No.  3  in  November  1943  to  1,700  pilots, 
and  the  items  were  validated  against  graduation-elimination. 


(1) Item validity.—The distribution of phis is described in table 27.6.

Table 27.6.—Item validity for Biographical Data, CE602SAB, for pilots in
primary training,¹ graduation-elimination criterion

Group                              N       Proportion    SDφ      Range of φ
                                           graduating
Pilots in primary training ....    675     0.89          0.07     -0.11 to 0.14
Pilots through advanced² ......    638      .77           .07     -.15 to .18

¹ In classes [designations illegible].

² Below-average students (who eventually graduated) were counted as eliminees.

Evaluation. — As  mentioned  previously,  the  22  items  in  Biographical 
Data  Blank,  CE602SAB,  were  based  on  opinions  of  classification-section  • 
personnel  as  to  what  biographical  information  would  be  predictive. 
Statistics  are  not  complete,  but  the  item-validity  data  indicate  a  general 
lack  of  predictive  value  of  the  items  as  compared  with  previously 
validated  items. 


Biographical Data Blank, CE602D *

This  is  the  only  form  of  the  Biographical  Data  Blank  that  has  been  in 
the  classification  battery,  which  it  entered  in  July  1943.  It  is  based  upon 
items  tried  out  experimentally  in  the  previous  forms.  Sixty-five  of  the 
items  from  form  CE602B  that  showed  empirical  validity  for  pilot  or 
navigator  prediction  were  incorporated  into  CE602C.  CE602D,  the  bat¬ 
tery  form,  contains  the  same  65  items  more  conveniently  arranged  for 
administration. 

Description.  (1)  Internal  characteristics.— The  items  of  form 
CE602D  are  distributed  among  the  following  categories:  Origin  of 
parents  and  early  home  environment,  subjects  studied  in  school  and 
proficiency  attained,  proficiency  in  athletics,  relative  frequency  of  per¬ 
formance  of  mechanical  and  academic  activities,  hobbies,  occupational 
experience, and aviation-training interest.


• A nonpsychological agency which made final decisions concerning classification.

* Developed by cooperative efforts of Psychological Research Unit No. 1 and
Psychological Research Unit No. 3.

(2) Administration.—Since it is desired that all examinees finish, the
time limit is generously set at 25 minutes.

(3) An experiment in administration.—To determine the probable degree
of falsification and its effect upon pilot validity, form CE602D was
administered to equivalent groups of students under three sets of instructions
designed to produce different degrees of motivation to falsify
or not to falsify. One set of instructions was framed with an attempt to
minimize the pressure against falsification, a second set to maximize
such pressure; and a third set was identical with the standard instructions
for the test.

Following are the salient sentences from the first set of instructions,
in which minimal pressure against falsification is exercised:

Read each question and the answers that follow it. Select the statement that best
answers the question for you. Do not try to see all possible interpretations of a
question or answer; these are meant to be straightforward questions with simple
answers. Use approximate answers when in doubt. Do not quibble over small
inaccuracies. Remember that what counts is the general impression made by your
answers together. No single response can decide your future.

Following are the pertinent portions of the second set of instructions,
in which maximal pressure against falsification is exercised:

As stated in Army Regulations 380-5, the Army Air Forces attaches great
significance to the response you are about to make. Read each question and the answers
that follow it. You must be certain to select the statement that best answers the
question for you. Do not attempt to answer with the choices which you think will
result in your being given the particular assignment you want.

Your signature attests to the truthfulness of your answers. If upon checking your
references, we find that you have committed perjury, your entire future in the Air
Forces will be endangered. Tests such as this are frequently used to detect perjurers,
since it is possible to check on the truthfulness of your answers from other
sources. The seriousness of fraud and perjury in the Army is set forth in the
fifty-fourth, ninety-third, and ninety-sixth Articles of War. The fifty-fourth Article
of War states specifically, "Any person who shall procure himself to be enlisted in
the military services of the United States by means of willful misrepresentation or
concealment as to his qualifications of enlistment, and shall receive pay or allowance
under such enlistment, shall be punished as a Court Martial may direct * * *"

Work as rapidly as you can without being careless. Do not think a long time
about questions that call for simple facts, but mark quickly what is true in your
case. Everyone should finish the blank and answer all the questions. Omit none.

You have 25 minutes. Remember, you are subject to military law. Answer with
strict honesty.

Following are the sections from the standard instructions that demonstrate
intermediate pressure against falsification:

In this blank you are asked for certain information about your background, your
family, your home, your education, your hobbies, and your civilian employment.
This is not a test. There are no "right" answers except those that tell the truth
about yourself. To a large extent, your success in air-crew training will depend on
how well you are understood by those in charge. All of the information asked for
in this blank has been shown to be related to air-crew training. It is therefore to
your own interest to fill out this blank carefully and completely. You will record
your answers on the separate answer sheet.

Read each question and the answers that follow it. Select the statement that best
answers the question for you. Sometimes, no one of the answers will fit your case
exactly. Do not worry about this, but select the answer that most nearly fits.

Approximately  600  to  900  students  were  tested  at  Psychological  Re¬ 
search  Unit  No.  1  during  May  1943  under  each  set  of  instructions  with 
the  results  shown  in  table  27.7.  Tables  27.8  and  27.9  show  the  critical 
ratios  of  the  differences  in  means  and  in  validity  coefficients. 


Table 27.7.—Validity data for Biographical Data Blank, CE602D, under three
different instructions based upon pilots in primary training and the
graduation-elimination criterion

Condition                             N      Prop.   Mean,   Mean,   Mean,     SD      r bis
                                             grad.   grad.   elim.   total¹
1. Minimum pressure for honesty ...   914    0.84    27.6    24.3    27.1      6.67    0.26
2. Maximum pressure for honesty ...   912     .80    25.9    21.2    25.0      6.73     .38
3. Standard instructions ..........   661     .79    26.9    22.1    26.0      6.89     .40

¹ Mean of total group.


Table 27.8.—Critical ratios of the differences between means of the total groups for
Biographical Data Blank, CE602D, under three different instructions

Condition                             1        2        3
1. Minimum pressure for honesty ...   ....     6.72     3.28
2. Maximum pressure for honesty ...   6.72     ....     2.79
3. Standard instructions ..........   3.28     2.79     ....

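Table 27.9.—Critical ratios of the differences between validity coefficients for
Biographical Data Blank, CE602D, under three different instructions

Condition                             1        2        3
1. Minimum pressure for honesty ...   ....     2.40     2.60
2. Maximum pressure for honesty ...   2.40     ....      .46
3. Standard instructions ..........   2.66      .46     ....

[The entry for conditions 1 and 3 is extracted as 2.60 in one cell and 2.66 in
the other; the source does not settle which is correct.]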


It can be seen that the biserial correlation for the Biographical Data
Blank, CE602D, is highest (0.40) when administered under standard
instructions. It was concluded that, although varied instructions do have
apparent effects upon scores for this test, the differences are not such as
to impair the effectiveness of the blank under standard conditions. Consequently,
the standard instructions were retained in the administration of
the test as part of the classification battery. It is most reassuring to note
that the strongest possible pressure for honesty did not improve the
validity of the test. Also noteworthy is the fact that encouragement of
laxity in responding seemingly lowered the test validity.
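
A critical ratio of the kind reported in table 27.8 can be sketched as follows. The formula assumed here is the usual one for two independent groups, the difference in means divided by the square root of the sum of the squared standard errors; the report does not say which formula was applied. The inputs are the total-group figures for the minimum-pressure and maximum-pressure conditions, taken to be means of 27.1 and 25.0, standard deviations of 6.67 and 6.73, and N's of 914 and 912.

```python
import math


def critical_ratio_means(m1, sd1, n1, m2, sd2, n2):
    """Critical ratio for the difference between two independent means.

    Uses the unpooled standard error of the difference (an assumption;
    the report does not state its formula).
    """
    se_diff = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (m1 - m2) / se_diff


# Minimum pressure for honesty versus maximum pressure for honesty.
cr_min_vs_max = critical_ratio_means(27.1, 6.67, 914, 25.0, 6.73, 912)

print(f"critical ratio, condition 1 vs. condition 2: {cr_min_vs_max:.2f}")
```

The result is close to the 6.72 given in table 27.8; exact agreement is not expected, since the tabled means are rounded to one decimal.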

(4) Scoring.—Biographical Data Blank, CE602D, is scored in two
ways: with a pilot key and a navigator key. The pilot key is based on
item validities determined in a sample of 1,882 primary pilot graduates
and 735 eliminees (classes 42I, 42J, 42K, and 43A) tested by Psychological
Research Unit No. 3 and also on 420 graduates and 176 eliminees
tested by Psychological Research Unit No. 1 during February to April
1942. All cases were obtained from administration of Biographical Data
Blank, CE602B. The development of this key is described in the description
of the CE602B form. The navigator key was also based on this
form, administered in January 1942 by Psychological Research Unit
No. 1 to 312 examinees, including 270 graduates from navigation training
and 42 eliminees. Both pilot and navigator keys include positively and
negatively weighted responses. The number of responses receiving
weights (either plus or minus) is 100 for pilot and 40 for navigator.

The scoring formula in each case is R - W + 20, where R refers to positively
weighted and W to negatively weighted responses.
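
The scoring formula can be illustrated with a minimal sketch. The key and the answer sheet below are entirely hypothetical (the actual keyed responses are not reproduced in this account), and the response labels such as "3-B" merely follow the item-alternative style of the sample items.

```python
def score_blank(marked, plus_key, minus_key, constant=20):
    """Score one answer sheet as R - W + constant.

    R counts marked responses weighted plus 1; W counts marked
    responses weighted minus 1. Unkeyed responses do not count.
    """
    marked = set(marked)
    r = len(marked & set(plus_key))
    w = len(marked & set(minus_key))
    return r - w + constant


# Hypothetical key and answer sheet, for illustration only.
PLUS_KEY = {"3-B", "7-A", "12-D"}
MINUS_KEY = {"5-C", "9-E"}
marked = ["3-B", "5-C", "7-A", "10-A"]

score = score_blank(marked, PLUS_KEY, MINUS_KEY)
print(score)
```

The constant of 20 simply shifts the scale so that an examinee marking more negatively than positively keyed responses still receives a positive score.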

A special item-validity study.—A study was initiated at Psychological
Research Unit No. 3 of the items in Biographical Data Blank, CE602B-SA
and CE602D, to determine new weights for the items which might
increase the predictive value of the stanine. Several hypotheses were
guiding principles in this connection:

a. Items that are significantly related to graduation-elimination but
unrelated to the stanine should make unique contributions to the stanine
and so receive substantial weights.

b. Items that are significantly related to the stanine, but unrelated to
graduation-elimination, should have negative weights in the stanine,
acting as suppression variables.

c. Certain items may add most to the stanine if weighted one way for
cadets who have high stanines and another way for cadets who have low
stanines.

Using answer sheets for Biographical Data Blank, CE602B-SA, completed
by pilot students in class 43-H (tested at Psychological Research
Unit No. 3), phi coefficients were computed for each alternative response
for the following comparisons: (1) 660 graduates and 120 eliminees
with high (7, 8, and 9) pilot stanines; (2) 148 graduates and 248
eliminees with low (1, 2, and 3) pilot stanines; (3) high-stanine and
low-stanine graduates; and (4) high-stanine and low-stanine eliminees. Various
combinations of these groups gave the correlation tables desired. The
Biographical Data Blank was not in the classification battery at the time
the pilots in this study were classified.
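
For one alternative response, a phi coefficient of the kind computed here comes from a fourfold table. The sketch below uses the standard formula; the group sizes are those of the high-stanine comparison (660 graduates and 120 eliminees), but the response counts are hypothetical, chosen only to show the arithmetic.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi for a fourfold table.

    a = graduates marking the response,  b = graduates not marking it,
    c = eliminees marking the response,  d = eliminees not marking it.
    """
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        return 0.0  # a margin is empty, so phi is undefined; treat as zero
    return (a * d - b * c) / denom


# Hypothetical counts: 330 of 660 graduates and 40 of 120 eliminees
# mark the response (50 percent against 33 percent).
phi = phi_coefficient(330, 330, 40, 80)
print(f"phi = {phi:.3f}")
```

A positive phi here means that the response is marked more often by graduates; a response marked more often by eliminees gives a negative phi of the same interpretation in reverse.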

Significant results appear when the items in the blank are considered
in relation to graduation-elimination and to the stanine. The 76 responses
that have significant phis with graduation-elimination, but zero phis
or phis of opposite sign with the stanine, were weighted plus 1 or
minus 1 to give the validity key for score A. The 92 responses that have
significant phis with the stanine, but zero phis or phis of opposite sign
with graduation-elimination, were weighted plus 1 or minus 1 to give
the stanine key for score B.
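
The sorting of responses into the two keys can be sketched as a simple rule. Several points are assumptions made for illustration and are not stated in the report: "zero phi" is read as "phi below the significance threshold", the threshold value is hypothetical, and the sign of each weight is taken from the phi with the variable that the key emphasizes. Responses significantly related to both variables with opposite signs are placed in key A only, as the discussion of the results indicates.

```python
def assign_response(phi_criterion, phi_stanine, threshold):
    """Return (key, weight) for one response, or (None, 0) if unkeyed.

    Key "A" emphasizes graduation-elimination; key "B" emphasizes the stanine.
    """
    crit_sig = abs(phi_criterion) >= threshold
    stan_sig = abs(phi_stanine) >= threshold
    same_sign = phi_criterion * phi_stanine > 0

    # Key A: valid for the criterion, unrelated or oppositely related to stanine.
    if crit_sig and (not stan_sig or not same_sign):
        return ("A", 1 if phi_criterion > 0 else -1)
    # Key B: related to the stanine, not significantly related to the criterion.
    if stan_sig and not crit_sig:
        return ("B", 1 if phi_stanine > 0 else -1)
    # Significant for both with the same sign, or for neither: left unkeyed.
    return (None, 0)


THRESHOLD = 0.05  # hypothetical significance cutoff for phi

print(assign_response(0.10, 0.01, THRESHOLD))
print(assign_response(-0.10, 0.08, THRESHOLD))
print(assign_response(0.01, -0.09, THRESHOLD))
print(assign_response(0.10, 0.10, THRESHOLD))
```

A response already covered by the stanine (significant for both, same sign) is left out of both keys under this rule, since it could add nothing unique.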
A sample of all papers available was scored on these two keys, excluding
the papers used in the original analysis.*

Table 27.10 shows the correlations with the stanine and with the
graduation-elimination criterion of scores A and B.

Table 27.10.—Validation data for Biographical Data Blank, CE602B-SA, for keys
which emphasize correlation with the criterion (key A), and correlation with
the stanine (key B), using pilots in primary training¹ and the
graduation-elimination criterion
[N = 551, proportion graduating = 0.61]

Key   Mean,    Mean,    SD      r bis    r bis,       r with     r with stanine,   Increment³
      grad.    elim.                     corrected²   stanine    corrected²
A     20.74    18.24    4.50    0.33     0.30         -0.01      -0.01             0.09
B     19.06    19.51    2.00    -.0…      .09          .12        .30              -.005

[Column headings are partly illegible in the source and are supplied from the
parallel table 27.11; the final digit of the raw biserial for key B is illegible.]

¹ Using only cases with stanines of 4, 5, and 6, i.e., those not used in the
item-validity study.

² Assuming an unrestricted stanine standard deviation of 2.00.

³ The amount of validity that the score would add to the validity of the pilot
stanine if the test were added to the Classification Battery of August 1942.


It  can  be  seen  that  score  A  would  make  a  significant  addition  to  the 
stanine  in  use  for  these  students,  but  score  B  would  not.  The  failure  of 
score  B  to  add  anything  is  partly  a  result,  however,  of  the  selection  of 
items.  Items  having  significant  relationships  both  to  the  stanine  and  to 
graduation-elimination,  but  of  opposite  sign,  logically  might  have  been 
included  in  either  score.  To  avoid  duplication,  these  items  were  weighted 
only  in  score  A.  If,  instead,  they  had  been  weighted  in  score  B,  this  score 
would probably have approached a significant addition to the stanine.

The correlation between scores A and B is -0.14, both raw and cor¬
rected.  Thus,  though  the  validity  of  score  B  is  low,  it  is  measuring  some¬ 
thing  valid  not  included  in  score  A.  A  positive  correlation  between  the 
two  scores  would  have  been  a  more  favorable  condition  for  the  use  of 
one  as  a  suppression  variable. 

An  examination  was  made  of  the  responses  in  CE602B-SA  that  dis¬ 
criminate  significantly  in  one  direction  between  graduates  and  eliminees 
at  one  stanine  level  but  fail  to  discriminate  or  discriminate  in  an  oppo¬ 
site  direction  at  another  stanine  level.  No  consistent  trend  appeared  in 
the analysis of the 25 items in this category. The items come from all
sections of the blank. The only general finding in part I of the blank
(factual items) appears to be a slight indication that the low-stanine
graduates are men who have not had experience in certain areas (such
as mechanical) which are heavily weighted in the stanine, but who
may  have  been  potentially  able  in  those  areas.  The  opinion  items  that 
discriminate  differentially  also  seem  to  represent  no  significant  pattern. 
In  either  case,  the  number  of  items  manifesting  this  type  of  discrimi¬ 
nation  is  so  small  that  chance  could  have  produced  the  relationships  in 
question. 

In conclusion, this study showed that it seems doubtful that biographical
items discriminate differentially between graduates and eliminees at
different stanine levels to a significant degree. Items that are related to
graduation-elimination, but unrelated or oppositely related to the stanine,
add very significantly to the predictive value of the stanine. Items
that are related to the stanine but unrelated to graduation-elimination
do not add to the stanine prediction.

* The cases not used in the two analyses were those with pilot stanines of 4, 5, and 6.

Since a valid key independent of the stanine had been prepared successfully,
it was decided to try the same procedure for form CE602D.
Utilizing 600 graduates and 600 eliminees (classes 44-D through
44-H) from primary training, who had been tested with the July 1943
battery, two keys were again prepared, one to maximize validity, and
one to maximize correlation with stanine. A new sample of 2,000 cases
from classes 44-I and 44-J (July and November 1943 batteries) was
utilized for validation of these keys. The results are shown in table 27.11.

Table 27.11.— Validation data for Biographical Data Blank, CE602D, for the
classification battery key and for keys which emphasize correlation with the
criterion (key A) and correlation with the stanine (key B), using
pilots in primary training and the graduation-elimination criterion
[N = 2,000, p = 0.86]

                                                rbis,        r with    r with stanine,  Added
Key              Mg      Me      SDt    rbis    corrected 2  stanine   corrected 2      validity 3
Classification   30.10   27.28   6.27   0.24    0.32         0.46      0.55
A                26.14   24.48          .22     .26          .27       .33              o.6is
B                24.44   23.32   4.V4   .12     .18          .34       .41              .000

1 In classes 44I and 44J.

2 Assuming an unrestricted stanine standard deviation of 1.83.

3 The amount of validity the score would add to the validity of the pilot stanine.
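The validity coefficients in these tables can be checked against the usual biserial formula, r = ((Mg - Me) / SD) * (pq / y), where y is the ordinate of the unit normal curve at the point dividing graduates from eliminees. The sketch below assumes that the first three figures in the Classification row of table 27.11 (30.10, 27.28, 6.27) are the graduate mean, the eliminee mean, and the total standard deviation; that reading of the columns, and the function name, are assumptions.

```python
from statistics import NormalDist

def biserial_r(mean_grad: float, mean_elim: float,
               sd_total: float, p_grad: float) -> float:
    """Biserial correlation from group means, total SD, and proportion graduating."""
    q = 1.0 - p_grad
    z = NormalDist().inv_cdf(p_grad)  # normal deviate cutting off p_grad
    y = NormalDist().pdf(z)           # ordinate of the normal curve at z
    return (mean_grad - mean_elim) / sd_total * (p_grad * q) / y

# Classification key, table 27.11, with 0.86 of the sample graduating.
r = biserial_r(30.10, 27.28, 6.27, 0.86)
print(f"{r:.2f}")  # prints 0.24
```

The result agrees with the 0.24 reported for the classification key, which supports this reading of the columns.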


It is apparent from these data that the procedure of selecting items to
correlate with the stanine and not with the criterion, or vice versa, was
less successful with form CE602D and the July and November stanines
than with form CE602B-SA and the August stanine. This is due in part
to increased correlation between blank and stanine. It is also due in part
to the fact that the biographical-data score was included in the stanine
used as a criterion for item analysis. The more accurate procedure of
subtracting the biographical-data score from the stanine before doing the
item analyses and correlations with stanine did not seem worth the
considerable labor entailed at the time the study was undertaken. It was
believed that the unique contribution of Biographical Data to stanine
validity was larger than is now known to be true.

In summary statement we may say: (a) The procedure used in these
studies gave promising results with the first blank and stanine, in that
one group of items related to graduation-elimination but unrelated to
the stanine added a significant amount to the validity of the stanine;
(b) when this procedure was tested with a later version of the blank,
the amount of additional validity decreased markedly; and (c) significant
negative contributions to stanine, using keys which maximized
correlation with stanine and minimized correlation with criterion, were
not found for either blank.

Statistical results. (1) Distribution statistics.— Score distributions
for form D are described by the data in table 27.12.


Table 27.12.— Distribution statistics for pilot and navigator scores,
Biographical Data, CE602D

Score        Group                               N        M       SD
Pilot        Unclassified Aviation Students 1    3,000    22.5    6.9
  Do           ....do 2                          1,920    26.1    6.3
  Do           ....do 3                          1,300    27.2    4.9
  Do         West Point cadets 4                          29.7    4.9
Navigator    Unclassified Aviation Students 1    3,000    22.6    3.2
  Do           ....do 2                          1,920    21.1    3.0
  Do           ....do 3                          1,500    22.3    3.0
  Do         West Point cadets 4                          24.1    2.9

1 ... Battery.

2 Tested at Medical and Psychological Examining Units Nos. 4 to 10 inclusive with the
November 1943 Classification Battery.

3 Tested at Psychological Research Units Nos. 1, 2, and 3 with the November 1943
Classification Battery.

4 Class of 194*.


(2) Reliability coefficients.— A reliability coefficient of 0.62
(corrected) was obtained for the pilot score with a sample of 1,000
unclassified students tested at Medical and Psychological Examining Unit
No. 7. For the navigator score on the same sample, the coefficient was
0.35 (corrected). These coefficients were obtained by the split-half
method. An attempt was made to choose items for each part that would
make the content of the halves roughly comparable. The coefficients
obtained are much lower than test-retest coefficients, which are 0.86 and
0.49 respectively, for a time interval of approximately 28 days (N =
711, on classified aviation students tested at Medical and Psychological
Examining Unit No. 6, 11 to 19 April 1945).
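A split-half coefficient is ordinarily "corrected" by the Spearman-Brown formula, which steps the correlation between two half-tests up to the reliability of the full-length test. The sketch below assumes that this is the correction meant here; the text does not name it, and the half-test scores are made-up numbers for illustration only.

```python
import math

def pearson(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_brown(r_half: float) -> float:
    """Step a half-test correlation up to full-test reliability."""
    return 2 * r_half / (1 + r_half)

# Hypothetical scores of six examinees on the two halves of a blank.
half_a = [10, 12, 9, 15, 11, 14]
half_b = [11, 10, 10, 14, 12, 13]
print(f"corrected reliability: {spearman_brown(pearson(half_a, half_b)):.2f}")

# Working backward: a half-test correlation of about 0.45 would
# step up to the corrected pilot coefficient of 0.62.
print(f"{spearman_brown(0.45):.2f}")
```

On this assumption the corrected coefficients of 0.62 and 0.35 correspond to half-test correlations of roughly 0.45 and 0.21.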

(3) Factor composition.— The classification form of the Biographical
Data Blank, CE602D, was factor analyzed with two batteries (see ch.
28 for a full discussion), the July 1943 Classification Battery and the
November 1943 Classification Battery, and for both the navigator and
pilot keys.

The navigator score in both analyses revealed only one significant
loading, which is in the mathematics-background factor. The loading in
the July 1943 battery was 0.42, in the November 1943 battery 0.50,
with a weighted average of 0.45. The communality is quite low, as might
be expected from the dearth of significant loadings in diverse factors.
In the July battery the communality was 0.25, in the November
battery, 0.30, with a weighted average of 0.26.

For the pilot score of the blank the highest loading is in the
mechanical-experience factor. This loading was 0.50 in the July-battery analysis
and 0.53 in the November-battery analysis, with a weighted average of
0.51. The next highest loading appeared in the mathematics-background
factor, which for the July battery was 0.29, and for the November
battery 0.31, with a weighted average of 0.30. A consistent negative
loading in the numerical factor appeared: -0.20 for the July battery, -0.26
for the November battery, with a weighted average of -0.22. The
communality for the pilot score is higher than for the navigator score, being
0.46 for the July battery, 0.55 for the November battery, and 0.48 for
the weighted average.

(4) Test validity.— For validity of the pilot score see table 27.13, and
for the navigator score, table 27.14.

Evaluation.— This form of the Biographical Data Blank (CE602D)
was refined to the point where it was acceptable for the classification
battery. All nonpredictive items had been dropped, leaving only 65. The
data show that this form is a good predictor of both pilot and navigator
aptitude, and, as such, is a useful instrument in classification testing.

Both scores are valuable because of their unique elements.
Mathematical background makes up about 18 percent of the total variance of
the navigator score — more than the test Mathematics A has to offer
toward this factor. This score should be made more reliable by increasing
the number of keyed responses, at the same time increasing the variance
in mathematics background and perhaps adding other valid variance.

About 26 percent of the variance in the pilot score is in mechanical
experience, which is covered better by mechanical tests. Its unique
validity is due to an unknown factor or factors. This unknown variance
should be identified and enlarged. For the sake of purity the mechanical
variance should be removed. The variance in mathematical background
should also be removed from the pilot score. To it can be attributed the
small validity of the pilot score for navigation training. The purposes of
classification would be better served if this variance were confined to
the navigator score.
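The percentages quoted in this evaluation appear to be squared factor loadings, on the usual assumption that the factors are uncorrelated, so that a squared loading gives the share of score variance attributable to that factor. This reading is an inference from the figures, not something the text states. A quick check against the loadings reported under factor composition:

```latex
\begin{align*}
\text{pilot score, mechanical experience (weighted average):}\quad & 0.51^2 \approx 0.26 \\
\text{navigator score, mathematics background (July battery):}\quad & 0.42^2 \approx 0.18 \\
\text{navigator score, mathematics background (weighted average):}\quad & 0.45^2 \approx 0.20
\end{align*}
```

The pilot figure reproduces the 26 percent quoted. The navigator figure of 18 percent agrees with the July loading of 0.42 rather than with the weighted average of 0.45, which would give about 20 percent.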

Variations. (1) Biographical Data Blank, CE602C.— As mentioned
above, this form is made up of 65 items which showed empirical
validity for pilot or navigator prediction. Since it is exactly like CE602D
except for order of items, nothing further need be said concerning it.

(2) Form CE602W.— A special form of the Biographical Data Blank,
CE602W, was constructed at Headquarters, AAF Training Command,
for administration to WASP (Women's Auxiliary Service Pilots) pilot
trainees.* While no statistical data are available at the time of writing,
a brief description of the form may be of interest. This blank contained
61 items, which were slanted for a female population and designed to
elicit general biographical information, expressions of interest, and facts
of experience. The categories covered include age, history of minor
illness, living experience, marital and maternal status or plans, social life
in college, marital adjustment of parents, use of cosmetics, tobacco, and
liquor, and childhood and adult activities and interests. Some items were
suggested by the impressions gained by a psychiatrist while interviewing
a preliminary group. He found that a number of health and personal
factors, particularly relating to marriage, seemed significant in the
general adjustment of these trainees.

* Chief contributor: Maj. R. L. Thorndike.

Following are several typical items:

Do  you  have  headaches? 

A.  Practically  never  have  headaches. 

B.  Have  mild  headaches  occasionally  at  irregular  intervals. 

C.  Have  occasional  severe  headaches. 

D.  Have  headaches  recurring  at  regular  intervals. 

E.  Have  frequently  recurring  severe  headaches. 

If you are married or get married, what would you consider the ideal size of
family for you?

A.  No  children. 

B.  One  child. 

C. Two children.

D.  Three  or  four  children. 

E.  Five  or  more  children. 

While you were in college (or of college age), how frequently did you have
"dates"?

A. Not at all.

B.  Once  or  twice  a  month. 

C. About once a week.

D.  Two  or  three  times  a  week. 

E.  Four  times  a  week  or  more. 

Biographical  Data  Blank,  CE602E  *• 

In an attempt to increase the validities of form CE602D and to
provide a more adequate coverage of personality and background factors
related to air-crew training, new items were constructed and
incorporated in CE602E. These new items were suggested by inspection of (1)
valid items in the CE602D version of the blank, (2) job-analysis data,
(3) clinical data, and (4) experimental personality inventories.

Description.— Part I of the blank pertains to social background in
general, and part II pertains to self-evaluations of different personality
aspects. Categories of items in part I include: early home environment,
hobbies, athletic experience, career and occupational interests, education
of parents, father's occupation and financial condition, intrafamily
relationships, extent of parental participation in examinee's home and
school life, marital interests, social habits and types of friends,
personality traits considered desirable, sleep habits and dreams, opinions
about Army officers, and flying-duty preferences.

Items in part II pertain to the following categories: vacation
preferences; leisure-time activities and satisfactions; interests in books, songs,
and comics; opinions about typical social-conflict situations;
self-evaluation of personality traits; opinions about behavior of other people;
opinions about the enemy; and opinions about generally accepted social
attitudes.

(1) Internal characteristics.— This form contains 300 items, divided
into 2 parts of 150 each.

»!  P»rtkol*(K*l  I  i  l  Unit  So.  I.  Clitof  (oMukHoii:  U  J*kn  S.  HoHint, 
Sec  Uiim  U.  PrtOtnokr,  Set  V  Srtk,  tnd  Cojx  DoniM  L  Sopor. 

(2) Administration.— All examinees are supposed to finish both parts
of the blank. This requires approximately 60 minutes per part.
Examinees are informed when half the time is up on each blank.

Following are two typical items from each part, respectively:

Which  of  the  following  skilled  jobs  would  you  rather  hold? 

A.  Draftsman. 

B.  Carpenter. 

C.  Watchmaker. 

D.  Inspector. 

E.  Tool  and  die  maker. 

How  often  do  you  worry? 

A. Very frequently.

B.  Frequently. 

C. Occasionally.

D.  Very  seldom. 

E.  Never. 

The  principal  satisfaction  most  people  get  out  of  participating  in  sports  is  that  of: 

A. Showing their skills.

B. Being with friends.

C.  Preserving  health. 

D.  Competing  with  others. 

Ordinarily, labor does not get its fair share of what it produces.

A.  Yes. 

B.  No. 

(3) Scoring.— It is planned that new valid items from test CE602E
will be incorporated with items in CE602D. The scoring formula will
be R-W-K.

Statistical results. (1) Item validity (pilot).— Form E was
administered at Psychological Research Unit No. 1 in April 1944 to 682
classified pilots in pre-flight school, and the sample was divided into
odds and evens groups for item correlation with the graduation-elimination
criterion (primary training) and cross-validation of total scores.
The proportion of graduates was 0.78. Responses with phis of 0.11 or
greater were keyed. The distribution of phis is given in table 27.15.
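The phi coefficient used for keying is the correlation in a fourfold table that crosses "chose this response or not" with "graduated or was eliminated." The sketch below is a minimal illustration: the cell counts are hypothetical, the function names are assumptions, the default threshold of 0.10 is the one quoted for the navigator analysis later in this chapter, and the 85-15 limit on the split of responses is the one described for table 27.15.

```python
import math

def phi_coefficient(a: int, b: int, c: int, d: int) -> float:
    """Phi for a fourfold table.

    a: graduates choosing the response    b: graduates not choosing it
    c: eliminees choosing the response    d: eliminees not choosing it
    """
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else 0.0

def is_keyed(a, b, c, d, threshold=0.10, max_split=0.85):
    """Key a response if phi reaches the threshold and the split is not too extreme."""
    chosen = (a + c) / (a + b + c + d)
    within_split = (1 - max_split) <= chosen <= max_split
    return within_split and phi_coefficient(a, b, c, d) >= threshold

# Hypothetical counts for one response in a half-sample of 342
# (267 graduates, 75 eliminees).
counts = (120, 147, 22, 53)
print(f"phi = {phi_coefficient(*counts):.2f}, keyed = {is_keyed(*counts)}")
```

A response with a phi of the same size but opposite sign would count against the examinee, which is how the separate rights and wrongs scores in the validity tables arise.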


Table 27.15.— Frequency distribution of item-validity phis of responses in
Biographical Data, CE602E, for pilot students

                          Odds sample 1                        Evens sample 2
                    2-choice       Multiple-choice       2-choice       Multiple-choice
Phi range         Part I  Part II   Part I  Part II    Part I  Part II   Part I  Part II
0.23 to 0.27         0       1         2       0          0       0         0       0
.18 to .22           1       0         7       0          0       4
.13 to .17           1       6        12       5          3       3        15       1
.08 to .12           7      15        42      15          4      27        36      14
.03 to .07           7      54        68      18         17      51        91      25
-.02 to .02         12      18       100      26          9      28        95      25
-.07 to -.03                          68      26                           70      27
-.12 to -.08                          4|      12                                   10
-.17 to -.13                           8       2                                    2
-.22 to -.18                           6       0                                    0
Totals              10      U4       161     104         31     113       353     104

1 N = 340.
2 N = 342.

No items with more extreme division than 85-15 were included. Items
having two alternative responses were segregated from those having
more than two, and only positive phis are given for them.

(2) Test validity (pilot).— Two scoring keys were derived from the
odds and evens, and a cross-validation study was made. The data are
found in table 27.16.
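The cross-validation design described here can be sketched as follows: a key is derived from each half of the sample, and each half is then scored with the key derived from the other half, so that chance item correlations in one half cannot inflate the validity found in the other. Everything in the sketch is an assumption made for illustration (synthetic random answers, 20 five-choice items, a keying threshold of 0.10, the function names); it is not the procedure as actually programmed for the study.

```python
import math
import random

def phi(a, b, c, d):
    """Fourfold-table correlation; returns 0.0 when a margin is empty."""
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else 0.0

def derive_key(sample, n_items, options, threshold):
    """Return {(item, option): +1 or -1} for responses whose phi passes the threshold."""
    key = {}
    for item in range(n_items):
        for opt in options:
            a = sum(1 for ans, grad in sample if grad and ans[item] == opt)
            b = sum(1 for ans, grad in sample if grad and ans[item] != opt)
            c = sum(1 for ans, grad in sample if not grad and ans[item] == opt)
            d = sum(1 for ans, grad in sample if not grad and ans[item] != opt)
            value = phi(a, b, c, d)
            if value >= threshold:
                key[(item, opt)] = 1    # counts toward "rights"
            elif value <= -threshold:
                key[(item, opt)] = -1   # counts toward "wrongs"
    return key

def score(answers, key):
    """Rights-minus-wrongs score of one answer sheet under a key."""
    return sum(key.get((item, opt), 0) for item, opt in enumerate(answers))

# Synthetic data: 200 examinees, 20 items, about 78 percent graduating.
random.seed(0)
options = "ABCDE"
people = [([random.choice(options) for _ in range(20)], random.random() < 0.78)
          for _ in range(200)]

# First, third, fifth ... examinees form one half; the rest form the other.
odds, evens = people[0::2], people[1::2]
odds_key = derive_key(odds, 20, options, threshold=0.10)
evens_key = derive_key(evens, 20, options, threshold=0.10)

# Each half is scored with the key derived from the OTHER half.
evens_scores = [score(ans, odds_key) for ans, _ in evens]
odds_scores = [score(ans, evens_key) for ans, _ in odds]
print(len(odds_key), len(evens_key))
```

Because the answers here are random, any responses that reach the threshold do so by chance, and scores from the opposite key should show no real relation to graduation. That is the safeguard the design provides.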


Table 27.16.— Validity data for two empirical keys for Biographical Data,
CE602E, pilot score (p = 0.78)

                                                                      rbis,
Key          Part   Formula    N      Mg      Me      SDt    rbis    corrected 1
Odds 2       I      Rights     342    13.75   12.68   3.54   0.18    0.22
  ....do 2   I      Wrongs     342     9.18    9.78   2.76   -.13    -.13
  ....do 2   I      R-W        342     4.57    2.90   5.78   .17     .20
Evens 3      I      Rights     340    12.03   10.99   2.77   .22
  ....do 3   I      Wrongs     340     4.56    5.20   1.69   -.22    -.2*
  ....do 3   I      R-W        340     7.47    5.79   3.86   .25     .26
Odds 4       II     Rights     342     8.27    8.00   2.0J   .00     -.01
  ....do 4   II     Wrongs     342     6.20    6.38   1.78   -.06    -.04
  ....do 4   II     R-W        342     2.07    1.62   3.52   .03     .01
Evens 5      II     Rights     340     9.16    9.04   1.96   .03     .04
  ....do 5   II     Wrongs     340     7.71    7.76   1.88   -.02    -.02
  ....do 5   II     R-W        340     1.45    1.28   3.67   JOS     .03

1 Assuming an unrestricted stanine standard deviation of 2.00.

2 Number of scored items = 59.

3 Number of scored items = 41.

4 Number of scored items = 34.

5 Number of scored items = J6.

(3) Item validity (navigator).— Form E was administered to 897
(part I) and 888 (part II) classified navigation students. Testing was
accomplished at Ellington Field and Selman Field in February 1944 and
at Psychological Research Unit No. 3 in February and March 1944.
Each group of students was subdivided into odds and evens samples,
and item correlations with the graduation-elimination criterion
(advanced training) were computed. The proportion of graduates was 0.89.
Responses with phis of 0.10 or greater were keyed. The distributions of
phis are given in table 27.17.

Table 27.17.— Frequency distributions of item-validity phis of responses for
Biographical Data Blank, CE602E, for navigation students

                    Odds sample, 2-choice
Phi range           Part I     Part II
0.23 to 0.27           1          0
.18 to .22             1          3
.13 to .17             2          9
.08 to .12             4         26
.03 to .07            22         63
-.02 to .02            8         22
Totals                38        123

[The entries for the remaining columns of this table (odds sample,
multiple-choice; evens sample, 2-choice and multiple-choice; parts I and II;
phi ranges down to -.27 to -.23) are not legible in the source.]


(4) Test validity (navigator).— Two scoring keys derived from
odds and evens groups were used in the cross-validation study. The
results are given in table 27.18.


Table 27.18.— Validity data for two empirical keys for Biographical Data, CE602E,
navigator score
[p = 0.89]

                                                                              rbis,
Group    Key          Part   Formula    N      Mg      Me      SDt    rbis    corrected 1
Evens    Odds 2       I      Rights            21.41   20.88   1.49   0.19    0.25
  Do       ....do 2   I      Wrongs            11.91   14.57   2.89   -.12    -.21
  Do       ....do 2   I      R-W                6.11    4.11          .14     .21
Odds     Evens 3      I      Rights            22.67   21.98   3.48   .10     .10
  Do       ....do 3   I      Wrongs     448    19.18   19.41   1.20   -.04    -.10
  Do       ....do 3   I      R-W        448     1.49    2.57          .08     .11
Evens    Odds 4       II     Rights     442    18.01   18.24   3.18   -.04
  Do       ....do 4   II     Wrongs     442    15.88   15.51   2.70   .07
  Do       ....do 4   II     R-W        442     2.11    2.71   5.57   -.06    .02
Odds     Evens 5      II     Rights     446    28.08   28.2!   3.82   -.02
  Do       ....do 5   II     Wrongs     446    20.01   19.10   3.22   .12     .14
  Do       ....do 5   II     R-W        446     8.05    8.91   6.35   -.07    -.08

1 Assuming an unrestricted stanine standard deviation of 2.00.

2 Number of scored items = 61.

3 Number of scored items = 65.

4 Number of scored items = 4f.

5 Number of scored items = 62.

Evaluation.— Part I of the inventory proved to be moderately valid
for the selection of pilots. Its correlation with the stanine was so low
(even though the stanine includes as one component the score on
Biographical Data Blank, CE602D) that considerable uniqueness is evident.
The new items in part I will therefore add noticeably to the pilot
validity of the classification form of the Biographical Data Blank. Part I
promises some additional validity as scored for navigator selection.

Part II does not show any promise of validity for either specialty,
even though there appear to be a few valid items. It is worthy of
comment that part II is more devoted to questions concerning the
examinee's personality traits and less to the factual type of part I.

Biographical  Data  and  Pilot  Specialization 

Two  forms  of  this  type  of  test  were  developed  especially  with  the 
view  of  discriminating  between  promising  fighter  pilots  and  bomber 
pilots.  Neither  had  been  followed  through  at  the  time  this  chapter  was 
written,  but  they  will  be  described  for  the  record. 

(1) Biographical Data Blank, CE602FW.— This form was constructed
at Headquarters, AAF Training Command, Fort Worth.11 It was
administered to pilots in basic schools during October 1943 along with seven
other experimental tests.

Although this study was concerned primarily with pilots, the possible
prediction of navigator aptitude was considered important. Many
instructors felt that navigational ability was important to a pilot in flying
a heavy bomber. Thus, it was believed that information which would
predict navigator success could help differentiate the bomber from the
fighter pilot.

Most  of  the  items  in  the  Fort  Worth  Biographical  Data  Blank  are 
identical  with  or  similar  to  items  in  the  previous  forms  of  the  Blank. 
This  form  contains  147  items  divided  into  the  usual  categories. 

11 Chief contributor: Capt. Launor F. Carter.

(2) Biographical Data Blank, CE602F.12— There were two main
criteria for the selection of items in CE602F. First, no questions were
to be taken from the classification-battery form of the Blank, CE602D.
Second, no questions were to be taken from the Fort Worth form,
which had been administered previously in a pilot specialization project.
Consequently, items that were deemed likely to discriminate between
fighter and bomber pilots either were selected from other forms of the
Biographical Data Blank or were written originally for CE602F. The
categories include marital and parental status, experience in saving
money, entertainment preferences, study habits, summer-camp
experience, relationships to parents, and home environment.

Test  CE602F  contains  61  items  divided  into  the  categories  described 
above,  and  requires  approximately  20  minutes  for  administration. 

Following are two typical items:

To  how  many  social  clubs  or  organizations  have  you  belonged? 

A.  None. 

B.  1. 

C. 2.

D.  3. 

E.  4  or  more. 

When you have a little extra money, you usually:

A. Buy some luxury you have wanted for a long time.

B. Get a good meal in town.

C. Go on a date.

D. Go out with the "boys."

E. Save it.

Occupational Experience Blank, CE603A 13

This blank was designed to reveal information from which examinees
could be classified occupationally according to training, interest, and
experience. It was based on the hypothesis that different occupational
groups possess different average air-crew aptitudes, and that prediction
of success or failure in air crew can be improved by knowledge of the
examinee's previous work experience.

Description.— This test asks for specific information concerning
occupations and training of students before and during their Army careers.

(1) Internal characteristics.— The blank is divided into five parts.
Part I asks for information as to the subjects studied in high school,
college, and vocational school, and the number of semesters they were
studied. Part II asks for information concerning special skills or
abilities acquired outside the work experience. Part III provides for a
detailed description of full-time civilian jobs and a briefer description of
part-time and temporary jobs. Part IV provides for a description of
training and duty assignments in the Army. Part V provides for an
expression by the student of his occupational interests. There is a
one-page supplement to the blank which asks eight questions of biographical
information.

12 Developed at Psychological Research Unit No. 3. Chief contributors: Cpl. Stanley
Blumberg, Lt. John I. Lacey, and Lt. Eli A. Lipman.

13 Developed at Psychological Research Unit No. 1. Chief contributor: Capt. Seymour P.
Stein.

(2) Administration.— All examinees are supposed to finish the blank.
The administrator takes up one item at a time, explaining it thoroughly
before the examinees answer the item. The time required is 45 minutes.

(3) Scoring.— Scoring consists of classifying the blanks according to
the Dictionary of Occupational Titles of the United States Employment
Service. The intention was to obtain quite specific classifications and
not to lump various occupations into broad categories. After the
examinees were classified, they were divided into occupationally homogeneous
groups. After primary training, each occupational group was to be
compared to the whole group to see whether there is a significant difference
in the graduation rate.
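The planned comparison of graduation rates can be sketched as a test of the difference between two proportions. The text does not say which test was intended, so the choice here is an assumption. The sketch also compares each occupational group with the remainder of the sample rather than with the whole group, so that the two proportions are independent. The counts are hypothetical.

```python
import math
from statistics import NormalDist

def graduation_rate_test(group_grads, group_n, total_grads, total_n):
    """Compare one occupational group's graduation rate with the rest of the sample.

    Returns (group rate, rate of the remainder, z, two-sided p value).
    """
    rest_grads = total_grads - group_grads
    rest_n = total_n - group_n
    p_group = group_grads / group_n
    p_rest = rest_grads / rest_n
    p_pool = total_grads / total_n
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / group_n + 1 / rest_n))
    z = (p_group - p_rest) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_group, p_rest, z, p_value

# Hypothetical: 45 of 50 former draftsmen graduate, against 780 of 1,000 overall.
p_group, p_rest, z, p_value = graduation_rate_test(45, 50, 780, 1000)
print(f"group {p_group:.2f}, others {p_rest:.2f}, z = {z:.2f}, p = {p_value:.3f}")
```

With quite specific occupational classifications many groups would be small, and a normal-curve test of this kind would be unreliable for them. That may be one reason the text speaks of forming homogeneous groups first.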

Statistical results.— No data are available.

Personal Data Form, CE605A

The purpose of this inventory 14 is to measure susceptibility to
combat and near-combat neuroses by means of carefully selected
biographical items.

Description.— The problem of predicting combat neurosis, or
susceptibility to neurosis, is one which was bypassed during the early days
of the AAF Psychology Program due to the emphasis on rapid
development of selective instruments that validated against the criterion of
successful completion of some type of air-crew training. As selective
techniques became more refined, and the war progressed to the extent
that combat criteria were becoming available, some attention swung to
the problem of development of instruments that might predict successful
combat performance and susceptibility to combat fatigue or combat
neurosis. Several projective procedures were developed with this in
mind (see ch. 24). This instrument, however, represents the first
biographical-data approach to the problem. The rationale underlying the
test rests on the assumption that an individual's history, to the extent
that it can be obtained and correctly evaluated, is the best single index
of his future performance.

(1) Internal characteristics.— This instrument consists of 139 items,
each with from 2 to 5 alternative responses. These items deal primarily
with aspects of familial status and personality development. Some of the
items were taken from existing biographical inventories, and others were
written especially for this test. Sample items are:

As a child the teachers I liked best were:

A. Middle aged women.

B. Young women.

C. Middle aged men.

D. Young men.

My parents always considered my behavior:

A. Much better than that of other children.

B. Slightly better than that of other children.

C. Slightly worse than that of other children.

D. Much worse than that of other children.

14 Developed at Psychological Research Unit No. 1. Chief contributors: Sgt. Gerald S. Blunt,
Lt. Vivian Fisher, and Capt. Donald E. Super.

(2) Administration.— Pertinent directions are:

In this booklet you are asked for certain information about your personal history.
This is not a test in the ordinary sense; there are no "right" answers except the
ones which reflect your own particular past experiences and situations.

To a large extent your success in flying depends on how well you are understood
by those in charge. All of the information asked for in this booklet is for the
purpose of aiding your superior officers in understanding you. It is to your own
advantage, therefore, to indicate your answers to the items in this booklet as carefully,
completely and honestly as you can.

Whenever the word "parents" or "father" or "mother" is used in the following
questions and statements, it will be understood to include, when appropriate or
fitting in your case, any such words as "foster parents," "adopted parents," "step
parents," "legal guardians" or "foster father" * * *

(3) Scoring.— A priori keys were not used. Scoring was
accomplished and validities were obtained by means of cross-validation data
using a training criterion. Two scores were obtained: one based on
positively weighted responses and one on negatively weighted responses.

Statistical results.— The data that follow were computed for a sample
of 738 pilots in primary training, originally tested in May 1944 at
Psychological Research Unit No. 1.

(1)  Item  validity. — The  distribution  of  phis  based  on  item  analysis 
used  in  the  cross-validation  study  is  presented  in  table  27.19. 


Table 27.19.— Distribution of phis based on a sample of 738 pilots in primary
training, using a graduation-elimination criterion, for the Personal Data
Form, CE605A

Phi              f (odds)   f (evens)      Phi               f (odds)   f (evens)
0.18 to 0.22         2          2          -.07 to -.03         85
.13 to .17          13          8          -.12 to -.08         49
.08 to .12          35         44          -.17 to -.13         10
.03 to .07          as         88          -.22 to -.18          3
-.02 to .02        106        111

In order not to confuse the form of the phi distribution, the three
questions of a two-choice form which are contained in the test were
dropped from this listing. None of these reached or exceeded the 5
percent level of significance.

(2) Test validity.— Validation data were computed for a sample of
738 pilots, which was split into two equal groups, odds and evens.
Separate item analyses were accomplished for each subsample, and two
scoring keys devised. The criteria for scoring a response were: (a) a phi
significant at or beyond the 5 percent level (0.10); (b) a split of 85-15
or better.

The evens group was scored with the odds key, and the odds group
was scored with the evens key. The validities obtained are presented in
table 27.20.

Table 27.20. Validity data based on two groups of pilots in primary training, using
a graduation-elimination criterion, for the Personal Data Form, CE605A ¹

Group                           Score     M_g     M_e     SD_t    r_bis   r_bis ²

Odds scored with evens key ³    Rights    11.21   10.78   2.78    0.09    0.0[?]
                                Wrongs     8.91    9.29   2.38    -.09    -.08
Evens scored with odds key ⁴    Rights    12.26   11.33   2.74     .09     .10
                                Wrongs    13.23   13.17   2.93     .01    -.03

¹ For both groups N = 369, p_g = 0.81.

² Corrected to an unrestricted stanine standard deviation of 2.00.

³ Number of scored items = 38.

⁴ Number of scored items = 46.
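
As a rough check on how a biserial validity of this kind is obtained, the usual textbook formula can be applied to the Wrongs score of the odds group. The sketch below rests on assumptions: that the two means in the table (8.91 and 9.29) are those of graduates and eliminees respectively, that 2.38 is the standard deviation of the total group, and that 0.81 is the proportion graduating. The function name is hypothetical.

```python
from statistics import NormalDist

def biserial_r(mean_pass, mean_fail, sd_total, p_pass):
    """Biserial r = (M_p - M_q) / SD_t * (p * q / y).

    y is the ordinate of the unit normal curve at the point that
    divides it into the proportions p and q.
    """
    q = 1.0 - p_pass
    z = NormalDist().inv_cdf(p_pass)
    y = NormalDist().pdf(z)
    return (mean_pass - mean_fail) / sd_total * (p_pass * q) / y

# Wrongs score, odds group scored with the evens key (assumed reading of the table)
r = biserial_r(8.91, 9.29, 2.38, 0.81)
print(round(r, 2))  # -0.09
```

Under these assumptions the computed value agrees with the tabled coefficient for that row.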

For an N of 369, an r_bis of 0.10 is significant at the 5 percent level
of confidence, and an r_bis of 0.13 is significant at the 1 percent level. It
can be seen that none of the uncorrected correlations was significant at
either of these levels.
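
The two significance thresholds can be approximated with a large-sample normal test of a zero correlation. This is a sketch under that assumption (standard error taken as 1 divided by the square root of N - 1, two-tailed test); it is not necessarily the method used to obtain the figures quoted above, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

def critical_r(n, alpha):
    """Smallest |r| significant at level alpha (two-tailed), normal approximation."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z / math.sqrt(n - 1)

print(round(critical_r(369, 0.05), 2))  # 0.1
print(round(critical_r(369, 0.01), 2))  # 0.13
```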

Evaluation. — There  would  seem  to  be  an  excessive  number  of  phis 
beyond  the  confidence  limits,  but,  in  view  of  the  apparent  unimodality 
of  the  distribution  with  its  central  tendency  at  zero,  and  the  failure  of 
the  cross  validation  test  to  show  significant  biserial  correlations,  it  is 
probable  that  there  are  few,  if  any,  genuinely  valid  items  in  this  collec¬ 
tion  for  the  prediction  of  primary  pilot  training  success. 

While a relatively large number of individual items appeared to be
valid for the prediction of success in primary pilot training, the biserial
correlations for total scores did not support that promise of validity. It
is to be remembered that this instrument was designed to predict
susceptibility to combat neurosis and that combat criteria were not
employed in obtaining test validity. A validation study that will test the
original hypothesis concerning the value of this test is still to be made.

EVALUATION AND CONCLUSIONS

Statistical  results  for  some  of  the  tests  reviewed  in  this  chapter  reveal 
the  empirical  truth  of  the  hypothesis  that  aptitude  for  air-crew  training 
can  be  predicted  from  certain  biographical  information.  The  classifica¬ 
tion-battery  form  of  the  Biographical  Data  Blank  (CE602D)  was  shown 
to  be  a  satisfactory  measure  predictive  of  pilot  and  navigator  success. 

No attempt was made to develop a scoring key for the bombardier.
The reasons were several. The task of the bombardier, unlike those of
pilot and navigator, was without precedent either in military or civilian
life, and is, therefore, somewhat characterless. Its resemblances to
vocations or avocations are limited, and so hypotheses regarding items are
difficult to invent. Another reason was the lack of a training criterion
in which reasonable confidence could be placed. Circular error was
highly unreliable as a measure of bombing accuracy; and graduation,
which depended heavily upon it, was hard to predict by the best of
tests. Coupled with this was the fact that numbers of trainees in the
early days were small, and validation data in sufficient quantities were
slow in accumulating. Failure to develop a valid key for the bombardiers
in the General Information test also discouraged hopes of success with
the Biographical Data Blank. For completeness in a research program,
however, the validation of items for bombardier would have been
desirable.

Factor analyses did not reveal all the reasons why the Biographical
Data Blank is valid for pilots. The test has considerable unknown valid
variance. It is toward the understanding of this unique variance and its
further exploitation that new work should be directed in order to
improve the pilot score. While the valid variance of the navigator score
is probably fully accounted for, this score could be considerably
improved in reliability and therefore in validity. This would mean the
search for a large number of new items emphasizing the
mathematical-background factor, in order to maximize the usefulness for the
prediction of navigation success. When a satisfactory criterion is found for
the bombardier, attention should be given to the writing and validation
of items for that specialty.

CHAPTER TWENTY-EIGHT


Factorial  Picture  of  Tests 
and Criteria ¹


INTRODUCTION 
Reasons  for  a  Factorial  Picture 

It is the purpose of this chapter to summarize and to evaluate what
is known concerning the factorial composition of the tests described in
this volume, and of the criteria that they were designed to predict. This
is done with the conviction that the most significant information one can
have concerning either tests or criteria is in the form of factorial
description. The common factors serve as joint reference categories for
both alike.

Factorial  knowledge  of  tests  enables  us  to  predict  validities  in  advance 
if  we  also  know  the  relative  weights  of  the  factors  in  the  criteria.  Gen¬ 
eralizations  can  be  made,  therefore,  beyond  the  usual  facts  regarding 
validities  of  specific  tests  for  specific  criteria.  Having  a  large  battery 
of  tests  covering  most  of  the  human  traits,  each  test  described  in  terms 
of  factors  and  their  loadings,  one  could  then  readily  fit  tests  to  new 
selective  uses  merely  by  analyzing  the  new  criteria.  Having  the  specifi¬ 
cations  for  a  job  reduced  to  terms  of  factor  loadings,  one  would  then  be 
ready  to  select  a  battery  of  tests  which  would  yield  near  maximum 
validity.  If  pure  tests  of  factors  have  been  constructed,  an  economical 
battery  is  assured.  Pure  tests  can  be  arrived  at  by  factorial  procedures. 
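
The prediction of a validity from factorial knowledge can be illustrated with a short computation. The sketch assumes uncorrelated (orthogonal) common factors, in which case the expected correlation between a test and a criterion is the sum, over the common factors, of the products of their loadings. The function name and all loadings shown are hypothetical and are not taken from any table in this volume.

```python
def predicted_validity(test_loadings, criterion_loadings):
    """Sum of products of loadings on shared factors (orthogonal factors assumed)."""
    return sum(a * criterion_loadings.get(factor, 0.0)
               for factor, a in test_loadings.items())

# Hypothetical loadings, for illustration only
test = {"mechanical experience": 0.60, "visualization": 0.40}
criterion = {"mechanical experience": 0.30, "visualization": 0.20,
             "pilot interest": 0.25}

print(round(predicted_validity(test, criterion), 2))  # 0.26
```

A factor that is loaded in the criterion but absent from the test, such as the third one in this example, contributes nothing to the predicted validity. A battery reaches near maximum validity only when every factor weighted in the criterion is covered by some test.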

Plan of the Chapter

Several  factorial  studies  have  been  reported  in  preceding  chapters, 
each  where  it  most  appropriately  applied  to  a  group  of  tests.  Before  pre¬ 
senting  an  over-all  summary,  it  will  be  necessary  to  give  an  account  of 
four  general  analyses  whose  results  enter  prominently  into  considera¬ 
tion.  These  analyses  were  based  upon  four  of  the  classification  batteries 
— those  of  December  1942,  July  1943,  November  1943,  and  September 
1944. These analyses will be presented first, followed by a list of the
common factors with the best available definitions. Tests will then be
grouped according to their leading factors and tabulated with
weighted-average factor loadings, with communalities based upon these averages,
with estimates of reliability, and with weighted-average validities for
pilot, bombardier, and navigator training where those data are available.
A list of tests in alphabetical order with the same kind of data will be
given in Appendix B.

¹ Written by the Editor.

One feature of the chapter is almost unique in vocational psychology.
There are presented estimated factor loadings for the three training
criteria, as was forecast at the end of chapter 1. From the estimates of
loadings of factors not now represented in the classification battery, we
can see just what types of tests might well be added to the battery in
order to increase the coverage of the criteria and so improve predictions
of success.


ANALYSIS  OF  CLASSIFICATION  BATTERIES 


The  Data 


It was a general policy to obtain intercorrelations of tests in all
classification batteries very early after they went into effect, based on very
large samples of unclassified aviation students from different examining
units. These intercorrelations were not only used later as a basis for
revising regression weights, but were also good material for factorial
studies. The analyses have been based upon unusually large samples;
consequently, the results should be quite stable.

The December 1942 Battery.² For the December 1942 battery, the
numbers of cases were 3,254 and 4,774, the smaller number applying to
the new revised forms of tests going into the battery for the first time.
To the matrix were added validity coefficients for the criteria for
bombardier, navigator, and pilot training. Those for the bombardier were
based upon 1,829 students (1,453 graduates and 376 eliminees). Those
for the navigator were based upon r-averages of two samples, 1,970
(1,554 graduates and 416 eliminees) in one, and 731 (633 graduates and
98 eliminees) in the other. Those for the pilot were based mostly on
r-averages from a total of 10,925 students. The intercorrelations of the
criteria were guessed from previous estimates (see table 28.14) of their
common-factor loadings. The correlation matrix for this battery is
presented in table 28.1.


The July 1943 Battery.³ The intercorrelations for this battery are
presented in table 28.2. They were based upon 3,000 unclassified
students, 1,000 from each of the three original examining units. Along with
this battery were included in the analysis four composite scores: a
weighted aggregate for each air-crew specialty and the officer-quality
composite score. The purpose of these inclusions was to determine how
these composites were weighted factorially. (A more satisfactory method
of estimating factor loadings in the stanines would have been by the
correlation of weighted sums with each factor.) Being very complex,
and having very high communalities, they made a factorial solution
more difficult. No corrections were made for the spuriousness of the
part-whole correlations, so that the communalities of these variables
slightly exceeded 1.00, as should have been expected. From the
comparison of loadings from this analysis with those from other analyses, it is
apparent that no other serious distorting effects occurred.

² The analysis was executed by Capt. Lloyd G. Humphreys and Capt. John I. Lacey.

³ The analysis was executed by S/Sgt. J. Gordon Likin and Capt. Lloyd G. Humphreys.

The November 1943 Battery.⁴ The intercorrelations of this battery
were based upon 1,900 unclassified aviation students sampled from 10
examining units. Added to the list of variables are achievement tests in
the subjects of history, geography, and physics; also two experimental
tests, Decoding, CI214AX2, and Vocabulary (AAF), CI604B. The
intercorrelations of these with the battery tests and with each other were
based upon 543 unclassified students tested at Psychological Research Unit
No. 3.

The achievement examinations had been designed to evaluate students
who had completed 5 months of college training provided by the AAF,
which included the three subjects mentioned along with English and
mathematics. The last two subjects were covered in this battery by
Reading Comprehension and Mathematics A. The objectives of the AAF
English course stressed reading. Here was an opportunity to bring academic
achievement into the factorial picture. This was regarded to be
pertinent because of the previous discovery of the mechanical-experience
factor and what seemed to be either a science-education factor or a
mathematics-background factor. The latter area needed some
clarification, which it was hoped would be provided by the inclusion of both
physics and mathematics examinations in the matrix. It was also desired
to know how much kinship existed between a physics-examination score
and the mechanical factor.

This particular analysis presented one or two technical difficulties.
The absence of Numerical Operations, Mechanical Information, and
Speed of Identification from the battery meant the lack of three
excellent reference tests. The perceptual-speed factor did emerge, but it was
impossible to separate the numerical factor from general reasoning in
spite of considerable effort to do so.

The presence of the vocabulary test was favorable for the appearance
of the verbal factor, which rarely fails to emerge. In view of the heavy
verbal loading previously found in the Technical Vocabulary and General
Information tests, it was desired to know just how large the verbal
factor loading would be in a nontechnical vocabulary test.

The September 1944 Battery.⁵ The intercorrelations for this battery
were based upon testing of 8,158 unclassified aviation students at
Medical and Psychological Examining Unit No. 8. They are presented in
table 28.4. One correction was made in the coefficient of correlation
between General Information, CE505F, and Mechanical Information,
CI905R, in view of the fact that they had six items in common. The
resulting correlation between these two variables may be regarded as
having that much specific overlap expunged but with other common variance
unchanged.

⁴ The analysis was executed by Capt. Lloyd G. Humphreys and S/Sgt. Wayne S. [surname illegible].

⁵ This analysis was executed by Capt. John I. Lacey and Sgt. Harold H. Sinter.

There are two aspects of this analysis that are unsatisfactory. One
is the impossibility of separating the pilot-interest and
mathematical-background factors. The two Biographical Data scores come out together
on the same axis, which otherwise appears to be the pilot-interest vector.
Had Mathematics A been in this battery, we could have confidently
expected the separation. The other discrepancy is that the Rudder Control
test comes out on the psychomotor-coordination factor, which was not
true in the analysis of the November 1943 battery. Although this also
happened in the analysis of the carefulness battery (see ch. 25), it is
believed that this test actually has a unique factor, in common with only
the Rotary Pursuit test in the classification battery. What third type of
test would be needed in order to effect the expected separation is not
known.

The centroid factor loadings and communalities for the four analyses
are given in tables 28.5, 28.6, 28.7, and 28.8. The rotated factor loadings
are given in tables 28.9, 28.10, 28.11, and 28.12.
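
For readers less familiar with the term, the communality of a variable is the sum of its squared loadings on the common factors, and it should not exceed 1.00 for a properly defined variable. The following is a minimal sketch with hypothetical loadings and hypothetical function names; no values are taken from tables 28.5 to 28.12.

```python
def communality(loadings):
    """Sum of squared common-factor loadings for one variable."""
    return sum(a * a for a in loadings)

def is_suspect(loadings, tolerance=1e-9):
    """Flag a communality above 1.00, as can arise from spurious part-whole overlap."""
    return communality(loadings) > 1.0 + tolerance

# Hypothetical loadings, for illustration only
h2 = communality([0.6, 0.5, 0.3])
print(round(h2, 2))                 # 0.7
print(is_suspect([0.6, 0.5, 0.3]))  # False
print(is_suspect([0.8, 0.6, 0.3]))  # True
```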

The Apparatus Tests. In these batteries appear several apparatus
tests not described previously in this volume. They are described very
fully in report No. 4, so a minimum of description will be given here.

Discrimination  Reaction  Time,  CP611D,  was  designed  as  a  test  of 
speed  of  decision  and  reaction.  There  are  four  stimulus  patterns,  each 
consisting  of  a  pair  of  lights,  one  red  and  one  green.  Corresponding  to 
each  stimulus  pattern  is  a  micro-switch,  the  four  switches  being  arranged 
in  a  diamond-shaped  pattern.  The  position  of  each  switch — upper,  lower, 
right,  and  left — is  associated  with  a  corresponding  direction  of  the  red 
light  with  respect  to  the  green  light  (see  fig.  5.2  for  a  schematic  dia¬ 
gram  of  the  apparatus).  The  test  requires  80  reactions  with  stimuli  given 
in  random  sequence.  The  score  is  the  total  accumulated  time  between 
stimulus  and  correct  response.  A  white  signal  light  informs  the  exami¬ 
nee  of  the  correctness  of  each  response  he  makes. 

The Finger Dexterity Test, CM116A, consists of a pegboard having
48 square pegs in square holes. Each peg can be grasped by means of a
thick circular button at its top. The examinee lifts each peg from its
hole, turns it 180° clockwise, and resets it in its hole. The score is the
total number of pegs turned in the time allowed.

The Aiming Stress Test, CE211A, is a type of steadiness test in which
the examinee tries to keep a rod delicately balanced on a fulcrum so as
to avoid contact of the end of the rod with the sides of the hole into
which it is inserted. During this activity, he is distracted by a "patter"
which is intended to be disturbing to him, and by secondary mental tasks
such as counting flashes of light. The score is the total time of contact
during a given interval.

Table 28.12. Rotated factor loadings for the September 1944 classification battery (N=8,158)

The Rotary Pursuit Test, CM803A, is a modified Koerth pursuit test
in which the examinee tries to keep a prod in contact with a metallic
spot on a phonograph-type disk which is rotating at the rate of one
revolution per second. In the same test with divided attention, CP410B,
there is a second simultaneous task for the left hand which requires the
examinee to keep one of two keys closed in correspondence with one of
two lights.

In the Rudder Control Test, CM120B, the examinee sits in a mock
cockpit of an airplane. His own weight throws the seat off balance
unless he applies correction by means of a rudder-control mechanism. The
score is the total time he keeps the cockpit pointed directly at a target
light straight ahead. The task requires a keen appreciation of loss of
balance and a quick but not over-controlled correction made by leg
action.

The Factors

The statistical results from these analyses are summarized factor by
factor in the following paragraphs. The tests as well as the factors are
very much in common to all four analyses. It should be noted, however,
that several tests changed form with change of battery. In December
1942, the General Information Test was the Technical Vocabulary and
Information Test, CE505C; in July 1943, it was General Information,
CE505D; in November 1943, it was General Information, CE505E;
and in September 1944, it was CE505F. Reading Comprehension was
form CI614G in the first two batteries and CI614H in the last two.
Mathematics A was CI702E in the first battery and CI702F in the
second and third. Mathematics B provided an unweighted combination of
two scores, one from CI206B (Arithmetic Reasoning) and one from
CI706A (Numerical Approximation), in the first battery. The score in
the last three batteries was derived from a single test, CI206C
(Arithmetic Reasoning). Mechanical Principles was CI903A in the first
two batteries and CI903B in the last two. Rotary Pursuit was
CM803A (without the divided-attention feature) in the first battery
and CP410B (with divided attention) in the last three batteries.

In the tabulations below, no test is listed in any group unless its
loading for the factor exceeds 0.20 in all batteries in which it had a loading
at all. A blank means that a test or a score was absent from a particular
analysis. For the November 1943 analysis, no data are given for the
numerical or general-reasoning factors because of the failure to
separate the two.

Factor I is the common verbal factor. All repeated estimates for any
test are rather consistent with one or two exceptions. The drop from
a loading of 0.53 in Mathematics A to 0.37 and 0.29 coincides with a
change of form, which might indicate that the new form lost much of

Rotated factor I is described by the following data:


Factor loadings

Test name

December 1942
battery

July 1943
battery

November 1943
battery

September 1944
battery

General Information (navigator score)

0.17 

.54 

0.6S 

.66 

M4P 

0.S7 

.29 

•  •  •  • 

0.S2 

.5) 

.3? 

•  •  a  • 

.40 

at 

Technical Vocabulary and

Information (bombardier score) ...
General Information (pilot score) ...

.44 

.41 

,21 

ant 

.41 

.}} 

•  ••• 
.37 
.24 

.29 

.71 

.46 

its verbal variance. It should also be noted, however, that another factor,
mathematical background, did not emerge in the December 1942
analysis. It is possible that its variance became embroiled with the
verbal variance in that analysis. The drop from 0.77 to 0.63 in going from
Technical Vocabulary and Information, CE505C, to General
Information, CE505D, navigator score, is difficult to explain, since the change
was in name of the test only, so far as the navigator score was
concerned.

It is interesting to note that a technical vocabulary test can have as
high a verbal loading as a nontechnical vocabulary test. The selection
of items valid for the navigator, however, probably accounts for this
fact, since the verbal loading for the pilot score is only approximately
0.40. The navigator criterion has a positive verbal loading, whereas the
pilot criterion has not (see table 28.14). The selection of items correlated
with the two criteria would therefore yield different results with respect
to total-score verbal variance.

Rotated  factor  II  is  described  by  the  following  data: 


Factor loadings

Test name

December 1942
battery

July 1943
battery

September 1944
battery

Spatial Orientation I

Speed of Identification

0.6* 

M 

*0.44 

M 

0.62 

0.41 

.SS 

.4* 

JO 

on 

Spatial  Orientation  II  . . 

M 

.44 

U 

Dial  end  Table  Heading  . 

launwent  Coapeebmue*  II  ... 
laitiamnt  Comptehtnaioa  I . 

JO 

M 

M 

•  a  e  a 

a  a  a  • 

JO 

adW 

•  «  a  • 

This  is  the  familiar  perceptual-speed  factor  with  the  usual  loadings  in 
the  same  tests. 

Rotated  factor  III  is  described  by  the  following  data: 


Factor loadings

Test name

December 1942
battery

July 1943
battery

September 1944
battery


This is the common numerical factor with its usual very consistent
loadings. Two important results should be pointed out here. One is that
the back of the Numerical Operations test sheet has consistently higher
loadings than the front in all analyses. The back of the sheet is
composed of problems in subtraction and division, whereas the front is
composed of problems in addition and multiplication. The back also
provides five alternative responses, whereas the front provides only two
alternatives. Which of these distinctions is responsible for the finding is
not clear. The other notable feature is the drop in loading for
Mathematics B coincident with the dropping of Numerical Approximations
from that test. Numerical Approximations is much more of a
computations task than is Arithmetic Reasoning.

It  should  also  be  remarked  that  when  front  and  back  scores  are  com¬ 
bined  and  analyzed  as  a  single  test  variable,  the  numerical  loading  drops 
to  the  region  of  0.65  to  0.70.  This  must  mean  that  there  is  a  specific 
variance  common  to  the  two  parts  of  the  Numerical  Operations  test  in 
addition  to  their  common  variance  in  the  numerical  factor.  In  the  an¬ 
alyses  reported  here  the  two  variances  have  probably  combined. 

Rotated  factor  IV  is  described  by  the  following  data : 


Factor loadings


Test name

December 1942
battery

July 1943
battery

November 1943
battery

September 1944
battery

149 

a  47 

iW 

MF 

.4* 

.41 

M 

4* 

.41 

.44 

Discrimination Reaction Time ....
Instrument Comprehension II ....
Instrument Comprehension I ....

M 

•  ••• 

M 

•  •n* 

•  ••• 

3 

.4* 

M 

.47 

M 

"ji 

This is the spatial-relations (space I) factor of which the Complex
Coordination test has usually been a leading measure. There is some
indication that printed tests, such as Instrument Comprehension II, will
provide an equally good measure of the factor. The surprising strength
of correlations between the psychomotor tests in this group and certain
printed tests is mainly accounted for by this common factor.

Rotated  factor  V  is  described  by  the  following  data : 


•n 

i 

1 

Teal 

Dumber  1*41 
battery 

September  1*44 
battery 

an 

41 

M 

an 

•  •  *  • 

41 

This is the visualization factor usually prominent in tasks in which
some pictorial content must be mentally manipulated or transformed.
Since its only strong loading in these batteries is in Mechanical
Principles, it does not appear in all four analyses. It has appeared in a
number of experimental battery analyses, however, where other strong
tests in it have been present.

Rotated factor VI is described by the following data:


                                                Factor loadings

                                     December   July      November   September
Test name                            1942       1943      1943       1944
                                     battery    battery   battery    battery

Mechanical Information               0.94       ....      ....       0.64
Mechanical Principles                 .54       0.64      0.50        .37
Reading Comprehension                 .40        .34       .33        .00
General Information (pilot score)     .19        .40       .43        .34
Two-Hand Coordination                 .47        .0[?]     .39        .36
Biographical Data (pilot score)      ....        .40       .53        .39

This  is  the  mechanical-experience  factor  which  is  exceptionally  strong 
in  the  Mechanical  Information  test.  Estimates  of  its  loading  in  this  test 
are  even  higher  in  other  analyses.  The  Mechanical  Principles  test  out¬ 
strips  it  in  validity  for  pilot  training,  because  it  includes  other  valid 
factors  also,  such  as  visualization  and  spatial  relations.  One  noteworthy 
fact  is  that  four  tests  that  were  not  designed  as  mechanical  tests  turn 
out  to  have  substantial  loadings  in  this  factor.  One  reason  that  probably 
applies  to  two  of  these  tests  is  that  these  tests  were  developed  by  selec¬ 
tion  of  items  that  correlate  with  the  pilot  criterion.  Item  validation  thus 
tends  to  work  toward  complicating  a  test  factorially  rather  than  purify¬ 
ing  it  unless  one  also  selects  or  rejects  items  that  have  been  validated 
against  pure  factor  criteria.  Thus,  in  selecting  new  items  for  Biographi¬ 
cal  Data,  one  might  make  sure  that  their  correlation  with  the  score  on 
Mechanical  Information  is  very  low. 

Rotated  factor  VII  is  described  by  the  following  data: 


Factor loadings


Test name

December 1942
battery

July 1943
battery

November 1943
battery

September 1944
battery

Rotary Pursuit .....

0.S2 

.40 

.36 

44 

44 

JL  (i 

A  42 

fl  11 

Two-Hand Coordination .

'47 

14 

.41 

Aiming Stress .

Finger Dexterity .

It 

o  o  o  o 

«f 

.31 

.44 

Complex  Coordination  . 

.44 

.47 

This is a factor confined to psychomotor tests and so its
interpretation must be given accordingly. The best name at present seems to be
that of psychomotor coordination. It is strongest in the Rotary Pursuit
test, but also substantial in Complex Coordination and others. It is absent
from the Rudder Control test, which requires coordination of leg-muscle
action, in the November 1943 analysis but not in the September 1944
analysis (see table 28.12).* If the former result is confirmed, the
indication is that this factor is characteristic of arm-muscle activity but not
of leg-muscle activity. If the latter result is confirmed, it is a more

n**»  *«,  (•«*.»>*•  Rod*r  Control  miIwMi 
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general  motor  factor.  Since  it  is  substantial  in  one-armed  activity,  as  in 
Finger  Dexterity,  Rotary  Pursuit,  and  Aiming  Stress,  it  is  not  a  matter 
of  coordinating  the  two  arms.  General  muscular  agility  seems  to  be  the 
best  characterization  of  it. 

Rotated  factor  VIII  is  described  by  data  from  two  analyses: 


• 

Factor loadings

Test name

December 1942
battery

September 1944
battery

9.35 

9.11 

Finger  Dexterity  . . 

.28 

.34 

This factor would probably have failed to emerge in the December
1942 analysis had not the bombardier criterion been included in the
matrix. The bombardier criterion has a loading of 0.43 in the factor. It
had been found in previous estimations that a number of tests,
including the two listed above, and Rotary Pursuit, had bombardier validities
in excess of the amount that could be attributed to other known
factors. This factor sufficiently accounts for the remaining validities of these
tests. It has been called the bombardier factor, but to define it
psychologically, it seems to be a psychomotor-precision ability of some kind.
Further work toward the improvement of selection of bombardiers should
stress this factor very heavily.

Rotated  factor  IX  is  described  by  the  following  data  from  two  tests: 


                                                     Factor loadings

                                           December 1942   July 1943   September 1944
Test name                                  battery         battery     battery

General Information (pilot score)          0.14            0.32        0.55
General Information (bombardier score)      .33            ....        ....

There is a very slender basis for the interpretation of this factor, but
since the General Information tests (pilot score) were designed
particularly to measure pilot interest and since the pilot criterion has a
loading in the factor, some credence may well be given to the hypothesis
that this is a pilot-interest factor. The fact that this factor has small
loadings for Complex Coordination, and in one analysis for Spatial
Orientation I and II, three tests which have considerable face validity for pilot,
lends support to this hypothesis. If the hypothesis is correct, it would
seem that the bombardier score of Technical Vocabulary and Information
missed its aim. Pilot interest and bombardier interest may be closely
akin, or there may be no well-formed entity that can be called bombardier
interest. There were suggestions that this factor be called aviation
interest, but the navigator criterion seems to have little or no communality
with it (see table 28.14).

Rotated factor X is described by the following data involving two
tests only:

                                Factor loadings
Test name        December 1942   July 1943   September 1944
                    battery       battery       battery

ION 

•ON 

© 

0.48 

.26 

f 

0.47 

.28 

This combination of tests and these loadings, coupled with other experience,
force us to accept it as the general-reasoning factor (R1). It
comes out repeatedly in a number of experimental tests not represented
in these analyses. The lower loading of R1 for Mathematics B in the
December 1942 analysis is probably due to the presence of the Numerical
Approximation test, which seems much more numerical and less of a
reasoning test, at least by inspection.

Rotated  factor  XI  is  described  by  the  following  data: 


                       Factor loadings
Test name        July 1943    November 1943
                  battery        battery

0.42 

0.50 

.17 

.16 

.19 

.  .31 

.27 

This  factor  might  be  called  a  navigation-interest  factor,  but  it  is 
doubtful  whether  such  interest  has  sufficiently  crystallized  in  the  young 
men  who  took  the  examinations  as  to  represent  an  entity.  In  view  of 
the  loading  in  Mathematics  A,  it  is  more  likely  to  represent  a  mathe¬ 
matical-background  factor.  One  hypothesis  considered  was  that  it  repre¬ 
sents  a  natural-science  education  factor,  but  the  absence  of  the  physics 
test  in  this  list  fairly  well  disproves  that  hypothesis.  The  factor  will 
therefore  be  called  mathematical  background. 

Rotated  factor  XII  is  described  by  two  tests  in  only  one  analysis: 

                          Factor loadings
Test name                    Nov. 1943
                              battery

Rudder Control ...........     0.51
Rotary Pursuit ...........     0.27


This  factor  is  almost  entirely  confined  to  the  Rudder  Control  test 
which  is  relatively  pure  with  respect  to  it.  By  inspection,  the  Rudder 
Control  test  seems  to  involve  motor  coordination  controlled  mostly  by 
the  kinesthetic  sense.  The  best  hypothesis,  therefore,  seems  to  be  that  it 
is  a  kinesthetic-motor  factor.  There  is  no  doubt  of  the  unique  contribu¬ 
tion  of  this  factor  to  pilot  validity. 

It should be added that this factor did not come out in other analyses
(September 1944 battery and Carefulness battery). Instead, the Rudder
Control test then acquired substantial loading in the psychomotor-coordination
factor. In still another attempt at analysis of the November
1943 battery combined with additional tests, the kinesthetic factor again
appeared, and the test had a zero loading for the coordination factor.
The existence of the kinesthetic factor, therefore, rests upon an insecure
basis as yet. That the test in question contains much valid variance not
accounted for by its correlations with other tests cannot be questioned.
That some or all of this is a kinesthetic factor still needs confirmation.

Rotated factor XIII is described by two tests in one analysis:


                  Factor loadings
Test name            July 1943
                      battery

Geography ......       0.58
History ........        .52


For all that we know from these results, this could be a doublet confined
to these two tests. It is probably safe to assume that it is a more
general factor, however, and the hypothesis is offered that it is a social-science
background factor. This places it in a class with the mathematical-background
factor and the mechanical-experience factor, which seem to
represent variables in individual differences produced by learning. This
would not preclude, however, the operation in each of them of inborn
inclinations.


Factor  Loadings  of  Composites 

Of special interest in the July 1943 battery are the factor loadings in
the composite scores or stanines. These could be estimated from the
loadings of factors in the tests that enter into them and the weights assigned
to the tests. The loadings, presented separately in table 28.13, are
undoubtedly inflated, as was said before, due to the correlation of error
variances, but their relative positions are probably correct. From table
28.13 it will be seen that the leading factor in the bombardier composite
is spatial relations, other relatively strong ones being perceptual speed,
numerical, and mechanical experience. The leading factor in the navigator
composite is numerical, with substantial loadings in verbal, perceptual
speed, general reasoning, and mathematical background. For the pilot
composite, psychomotor coordination leads by a substantial margin; but
the composite also has strong loadings in pilot interest, mechanical experience,
and perceptual speed. The officer-quality composite is predominantly
verbal, with moderate loadings in general reasoning and mechanical
experience, as had been intended.
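
The estimation of composite loadings from test loadings and test weights can be sketched numerically. What follows is only an illustration under stated assumptions, not the computing procedure actually used in the program: the tests are taken as standardized, the factors as orthogonal, and the loadings, weights, and intercorrelations are hypothetical values chosen for the example. The function name composite_loadings is likewise an assumption.

```python
import math


def composite_loadings(loadings, weights, correlations):
    """Estimate factor loadings of a weighted composite of standardized tests.

    loadings[i][k]  : loading of test i on orthogonal factor k
    weights[i]      : weight given to test i in the composite
    correlations    : test intercorrelation matrix with unit diagonal
    """
    n = len(weights)
    # Variance of the composite is the quadratic form w' R w.
    variance = sum(
        weights[i] * weights[j] * correlations[i][j]
        for i in range(n)
        for j in range(n)
    )
    sd = math.sqrt(variance)
    n_factors = len(loadings[0])
    # Covariance of the composite with factor k is the weighted sum of the
    # test loadings; dividing by the composite's standard deviation gives
    # the composite's loading (its correlation with the factor).
    return [
        sum(weights[i] * loadings[i][k] for i in range(n)) / sd
        for k in range(n_factors)
    ]


# Hypothetical two-test, two-factor example (values are not from the report).
loadings = [[0.6, 0.2],
            [0.3, 0.5]]
weights = [2.0, 1.0]
correlations = [[1.0, 0.28],
                [0.28, 1.0]]

result = composite_loadings(loadings, weights, correlations)
```

Because the heavier weight falls on the first test, the composite in this example loads more strongly on the first factor than on the second. The sketch makes no allowance for the inflation from correlated error variances noted in the text.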


Table 28.13.— Factor loadings of four composite scores, derived from the July
1943 battery analysis

Validities of the Factors

The validities of the factors for bombardier, navigator, and pilot
training are the factor loadings in the three training criteria. The loadings
for the most commonly recurring factors have been estimated in
three ways, one of which depended upon a least-square type of solution,
one upon an iterative method, and one upon a direct analysis of the
December 1942 battery. The first two procedures employed data based
primarily upon the July 1943 battery. All three arrived at very similar
validities. In the iteration procedure, no factor loadings were permitted
to be so large that test validities were significantly overestimated. Validities
were permitted to be fully estimated for tests in which the factors
considered also practically account for the nonchance variance of the
test. For other tests the validity was permitted to be grossly underestimated,
if necessary. This was to allow for the existence of unknown
valid factors. In the least-square solution, as much overprediction was
permitted as underprediction. The results of the three estimation methods
are given in table 28.14.


Table 28.14.— Estimated factor validities for bombardier, navigator, and pilot
training criteria

Least-square solution    Iterative solution    Factor-analysis rotation

Verbal ........................   0.09
Perceptual speed ..............    .11
Numerical .....................    .15
Spatial relations .............    .22
Visualization .................   ....
Mechanical experience .........    .01
Psychomotor coordination ......    .10
Psychomotor precision .........   ....
Pilot interest ................   -.11
General reasoning .............    .02
Mathematical background .......   -.02


The agreements among the estimates in table 28.14 are generally good,
considering the facts that three factors were not in common throughout
all solutions, that somewhat different principles and procedures underlie
the estimates, and that somewhat different data were used as bases. The
major discrepancies are worthy of comment. The small verbal and perceptual
loadings for bombardier found by the first two methods virtually
dropped out in the third, and those for numerical and spatial relations
became greatly reduced. The introduction of the psychomotor-precision
factor in the December 1942 analysis seems to have been at the
expense of spatial relations loadings in some tests as well as in the bombardier
criterion, but did not lower variances systematically for navigator
or bombardier in the verbal, perceptual, numerical, and spatial relations
factors. Visualization seems to have been grossly underestimated
for the bombardier, psychomotor coordination for the pilot, and general
reasoning for the navigator, in the iterative method in particular. Further
space will be given to the factor composition of the pilot criterion later
in this chapter.

... procedures in principle in report No. 3 of this series. The validity
... procedures were the same as those described in connection with the
analyses of the December 1942 battery (see page ...).

A  comparison  of  tables  28.13  and  28.14  is  of  some  interest.  The 
weightings  of  tests  in  the  composites  had  been  derived  empirically, 
closely  in  accordance  with  the  principles  of  the  multiple-regression 
equation.  It  might  be  expected,  therefore,  that  the  relative  weights  ef¬ 
fective  for  the  factors  would  correspond  roughly  with  the  factor  load¬ 
ings  in  the  criteria.  This  proves  to  be  true,  with  one  or  two  notable  ex¬ 
ceptions  for  the  pilot.  Psychomotor  coordination  holds  fourth  rank  for 
factor  loading  in  the  pilot  criterion,  but  is  given  highest  weight  in  the 
pilot  composite.  This  would  indicate  an  overweighting  of  psychomotor 
tests for the pilot in the July 1943 stanine. The spatial-relations factor
has either first or second rank for weight in the pilot criterion, but holds
only fifth place in the composite. Better estimates of the factor loadings
of the pilot criterion (see table 28.17) do not alter the situation just described.
These discrepancies could be corrected by giving psychomotor
tests, such as Rotary Pursuit, less weight, and giving Dial and Table Reading
more weight. An even better solution would be to purify the tests.
Too many tests like Complex Coordination, Discrimination Reaction
Time, and Two-Hand Coordination have substantial loadings in both the
psychomotor-coordination and spatial-relations factors so that to increase
the weight of one is to increase the weight of the other. In a battery of
pure  tests,  there  is  much  more  freedom  of  action  in  arriving  at  optimal 
weights. 
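
The point about factorially complex tests can be made concrete with a small sketch. It is an illustration only: the loadings below are hypothetical, the two columns are taken to stand for psychomotor coordination and spatial relations, and the effective weight of a factor is assumed to be simply the weighted sum of test loadings on it. The function name effective_factor_weights is an assumption.

```python
def effective_factor_weights(loadings, weights):
    """Weighted sum of test loadings on each factor (unnormalized)."""
    n_factors = len(loadings[0])
    return [
        sum(w * row[k] for w, row in zip(weights, loadings))
        for k in range(n_factors)
    ]


# Columns: psychomotor coordination, spatial relations (hypothetical values).
complex_battery = [[0.4, 0.4],   # each test loads on both factors
                   [0.5, 0.3]]
pure_battery = [[0.6, 0.0],      # each test loads on one factor only
                [0.0, 0.6]]

# Doubling the weight of the first complex test raises both factor weights.
before = effective_factor_weights(complex_battery, [1.0, 1.0])
after = effective_factor_weights(complex_battery, [2.0, 1.0])

# Doubling the weight of the second pure test raises only the second factor.
pure_before = effective_factor_weights(pure_battery, [1.0, 1.0])
pure_after = effective_factor_weights(pure_battery, [1.0, 2.0])
```

With the complex battery no reweighting of a single test changes one factor without the other, whereas with pure tests each factor weight can be set independently.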


A SUMMARY OF FACTORIAL RESULTS
Two Master Tables

For the convenience of the reader, factorial results have been summarized
so as to present the picture of tests in clearest form.

One summary is a reference list, with tests given in alphabetical order
(see appendix B) and the other presents tests grouped by factors (table
28.15). The latter includes all tests analyzed as reported in earlier
chapters. The former includes only the printed tests that are the primary
subject matter of this volume. Wherever any test has been analyzed
more than once, a weighted mean of its loadings in each factor has been
determined. Three analyses have been omitted from consideration in
this, the November 1943 and September 1944 battery analyses, and the
carefulness battery analysis. In two instances the failure to separate
factor R1 from other factors left some uncertainty as to other aspects
of those analyses. The September 1944 analysis was completed too late to
be included. Each observed loading was weighted by the number of cases
in the sample analyzed. It is recognized both that there is no precedent
for the procedure, and that inaccuracies in rotations may far outweigh
sampling errors in some instances. On the other hand, repeated estimates
of the same factor loading in the same test appear to behave like sampling
statistics, and to this extent the procedure seems justified. Wherever
vacancies would otherwise have occurred in either summary table,
single estimates of factor loadings from the November 1943 and carefulness
analyses were used if they did not come from the doubtful columns
of loadings.
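
The weighting of repeated loadings by sample size can be sketched as follows. This is a minimal illustration, assuming that each analysis supplies one loading for the test on the factor together with the number of cases in its sample; the figures are hypothetical and the function name weighted_mean_loading is an assumption.

```python
def weighted_mean_loading(estimates):
    """Mean of repeated loadings, each weighted by its sample size.

    estimates: list of (loading, n_cases) pairs for one test on one factor.
    """
    total_cases = sum(n for _, n in estimates)
    if total_cases == 0:
        raise ValueError("at least one analysis with a nonzero sample is required")
    return sum(loading * n for loading, n in estimates) / total_cases


# Hypothetical example: the same test analyzed in two batteries.
estimates = [(0.40, 1000), (0.50, 3000)]
mean_loading = weighted_mean_loading(estimates)
```

The larger sample pulls the mean toward its own estimate. As the text cautions, such weighting treats the repeated loadings as sampling statistics and takes no account of inaccuracies in rotation.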

In both summary tables, one set of loadings, one communality, one
estimate of reliability, and one of validity for each criterion have been
listed. The communality is the sum of the squares of loadings as listed.
This value may be higher than that found in any single analysis because
all factors in the test did not necessarily appear in all analyses. There
is also the possibility that factor loadings from different analyses are
not properly identified as pertaining to one and the same factor, though
it is believed that this possibility is rather remote. The reliabilities have
been determined by various procedures: Kuder-Richardson, equivalent
halves, odd-even, or part I-part II intercorrelations without intervening
time intervals. It is believed that the best type to compare with communalities
is the fourth, which is essentially an alternate-forms procedure
without intervening time interval. This method was most commonly
employed in the program.
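
A short sketch may show why the listed communality can exceed that of any single analysis. It assumes, purely for illustration, a test analyzed twice, with a different second factor appearing in each analysis; the factor labels and loadings are hypothetical, and the function name communality is an assumption.

```python
def communality(loadings):
    """Sum of squared factor loadings for one test."""
    return sum(a * a for a in loadings.values())


# Hypothetical loadings of one test in two separate analyses.
analysis_a = {"V": 0.5, "N": 0.4}    # a reasoning factor did not appear here
analysis_b = {"V": 0.5, "R1": 0.3}   # a numerical factor did not appear here

# The summary lists one loading per factor drawn from all analyses.
merged = {**analysis_a, **analysis_b}

h2_a = communality(analysis_a)
h2_b = communality(analysis_b)
h2_merged = communality(merged)
```

Here the merged list carries three factors, so its sum of squares is larger than that of either analysis taken alone.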

If for any test $h^2$ is equal to $r_{11}$, it means that the entire nonerror
variance is accounted for by known common factors. If there is a substantial
positive discrepancy ($r_{11} - h^2 > 0.00$), it means that some common
factors have not yet been brought to light in the test. A substantial negative
discrepancy ($r_{11} - h^2 < 0.00$) would mean that there is an error of
estimation, but whether in the derivation of $r_{11}$ or of $h^2$ is unknown. In
both summaries the sizes of samples, upon which factorial results on
the one hand and validity coefficients on the other are based, are given
in two separate columns under the symbols $N_f$ and $N_v$, respectively.
Validities for pilot (P), bombardier (B), and navigator (N) only are
given.
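
The three cases just described can be expressed as a simple comparison of reliability with communality. The sketch below is an illustration only. The text does not say how large a discrepancy counts as substantial, so the tolerance of 0.05 is an assumption, as are the function name interpret_discrepancy and the example values.

```python
def interpret_discrepancy(reliability, communality, tolerance=0.05):
    """Classify the difference between reliability and communality.

    The tolerance marking a "substantial" discrepancy is an assumed value;
    the source gives no numerical threshold.
    """
    difference = reliability - communality
    if difference > tolerance:
        return "common factors not yet identified"
    if difference < -tolerance:
        return "error of estimation in reliability or communality"
    return "nonerror variance accounted for by known common factors"


# Hypothetical (reliability, communality) pairs.
case_positive = interpret_discrepancy(0.80, 0.50)
case_negative = interpret_discrepancy(0.60, 0.75)
case_equal = interpret_discrepancy(0.70, 0.68)
```

A reader applying this to the summary tables would need to choose the tolerance with the sampling errors of both coefficients in mind.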

The more meaningful summary (table 28.15) is arranged by factors.
Each test is placed in a list according to the factor in which it has its
highest loading. In each list, the tests are arranged in descending order
of loading in the factor. In this manner one can decide at a glance
which tests are purest and strongest in each factor and which ones have
similar secondary loadings. In a few instances, tests have been placed
in more than one list because they have similar high loadings in two
factors. The list in appendix B has a distinct advantage for those who
wish to look up a specific test by name. The list in this chapter is functionally
superior for those who wish to select batteries for different
purposes or who wish to select alternative tests by equivalent factorial
configuration.

Key to the Factors

As a supplement to these two master lists, it is desirable to present
here, together in one place, definitions of factors and the symbols that
are used to designate them. This list is alphabetical for the sake of
easy reference.

Ca — The carefulness factor occurred in tests designed as carefulness
tests, but curiously enough, and after all reasonably enough, it was
strong in the error scores rather than in the rights scores.

I1 — The first integration factor is common to tests that require the
effective memory of a number of rules in the carrying out of simple
tasks on paper.

I2 — This is the second integration factor, which is common to Following
Directions tests and others in which mental sets change frequently.

I3 — The third integration factor seems to be common to tests in which
the grasp of a wide variety of details is important.

J — The judgment factor is found in tests of practical judgment and
practical estimations. The ability probably involves making wise choices
from a number of alternative solutions to a practical problem. It seems
to be a judicial or criticizing function.

K — A kinesthetic factor of some kind; as yet it is fairly specific to
the Rudder Control test. It is not listed in table 28.15.

LE — This is a length-estimation factor involving the comparison of
lines or simple distances between points. It may involve more complex
estimates than those of linear dimensions.

M1 (PM) — This memory factor has been identified as paired-associates
memory. It is involved in tasks requiring the memorization of
items in pairs and is evaluated by an immediate test of retention and
recognition.

M2 (VM) — The second memory factor is identified as a visual-memory
ability. It is prominent in tests requiring the retention and recall of
a pictorial stimulus after very short time intervals. The length of time
interval may be an irrelevant condition in this as in factor M1.

M3 — This memory factor seems to be restricted to memorizing paired-associates
material in which one item is a pictorial symbol and the other
is a verbal symbol.

MB — Mathematical background, which may include mathematical interest
as well as mathematical training.

ME — The mechanical-experience factor is most heavily and purely
weighted in the Mechanical Information Test, and tests of Driving Skill
and of Tool Function.

P — The perceptual-speed factor involves the rapid comparison of
visual forms, and the notation of similarities and differences in form
and detail.

Table 28.15.— Factor loadings, communalities, reliabilities, and

CAREFULNESS


Test and Code No.                                   Nf     Ca

Plotting Test, CE452A (Total Wrongs) ..........    354     59
Complex Scale Reading, CE454A (Total Wrongs) ..    354     57
Plotting Accuracy, CE453A (Total Wrongs) ......    354     51
Directional Plotting, CE455A (Total Wrongs) ...    354     41

It 

Is 

Is 

J 

LE 

MB 

ME 

Mi 

PM 

Ms 

VM 

Ms 

N 

P 

PI 

00 

09 

22 

08 

INTEGRATION  I 


Signal  Interpret* lion, 

CieseAXS . 

Combat  Planes, 

CI6WAX5 . . 

Flight  Formations, 
C16MAX5 . 


200 

200 

200 


-02 

|-03 

-02 


OS 

22 

09 


INTEGRATION II

Following  Directions, 
CP402A . 

Cod*  Analysis, 

CI053AX2 . 

208 

260 

.... 

11 

03 

54 

40 

10 

*42 

.... 

05 

00 

-03 

-07 

.... 

.... 

.... 

25 

29 

09 

08 

.... 

INTEGRATION III

Planning  Air  Mancu- 
vera.  CI40SAX3. . . 
Cod*  Analysis, 

CIQ53AX2 . 

Planning  a  Course. 

CI406AX2 . 

Figure  Classification, 

C12I3AXI . 

Spatial  Reasoning, 
C12UUX1 . 


038 

200 

430 

202 

404 


00 

*iO 

IT 


-11 


43 
42 
41 
38 
38  [  14 


10 


-19 

00 

10 


20 

-07 

00 


02 


08 

29 

30 
05 
18 


00 

00 

05 

01 

10 


JUDGMENT 


Practical  Judgment  11, 
CI3011IX3 . 

170 

202 

202 

1713 

170 

372 

01 

45 

39 

38 

37 

34 

34 

32 

13 

00 

14 

33 

-02 

15 

15 

30 

-05 

(Work  Plan) 

Practical  Judgment, 
C1301HX1 . 

14 

00 

(Non-Mechanical) 
Beoueneeol  Maneuvers, 
CI4I0A . 

Judgment  (Pure)  Com- 
monKiue,  AAFQE 
JR  p-3 . 

Practical  Estimations 

i,  ciaosAXi . 

01 

33 

13 

15 

-05 

05 

Competitive  Planning, 

LENGTH  ESTIMATION 


Pattern  Assembly, 

CPMMA . 

(shorter  Path— Path 
Distance,  CPOOsB. . . 


Shorter  line— lane 
LetigOis,  CP000R.  . 


Nearest  Point— Point 
Distance,  CPHU7H. . 


Map  Dietance,  CP624II 


202 

545 

545 

545 

*58 


.... 

01 

.... 

-09 

.... 

02 

-01 

04 

It 

17 


—  11 

11 

23 

04 


17 

09 

-04 

! 

09
validities for tests grouped according to highest factor loadings *

FACTOR  (Ca) 


PM.PMi 
PP  P8 

Ri 

QR 

lt> 

Ri 

SR 

01  .... 

10 

.... 

.... 

13 

-07  .... 

03 

.... 

.... 

13 

04  .... 

07 

.... 

.... 

00 

-03  .... 

02 

00 

13  ....  03 


P8 

V 

.... 

Zi 

39  77 


33  73 


•so  30  00 


_ Validity 

N«  Ipit  lBoa.lNa*. 


FACTOR  (It) 


IS 

41 

.... 

00 

.... 

.... 

.  ... 

17 

01 

00  .... 

2112 

*21 

20 

33 

17 

31 

-01 

70  07 

02 

22 

22 

10 

04 

43  04 

1303 

33 

FACTOR  (IQ 

....  10  ....  07  10 
....  10  ....  17  20 

FACTOR  (14 


13 

10 

00 

.... 

-00 

17 

11 

20  17  M  70 

23  08  M  M 


FACTOR  (J) 
00  .....  .... 


FACTOR  (LE) 


.... 

.... 

IS 

01 

03 

04 

i 

SI 

41 

17 

30 

48 

03 

•S3 

00 

SO 

40 

27 

08 

23 

23 

01 

07 

j 

30 

42 

03 

10 

48 

08 

0103 

to 

18 

20 

01 

-01 

32 

10 

.... 

.... 

.... 

17 

23 

11 

08 

40 

SO 

73 

80 

2270 

IS 

24 

.... 

.... 

•48 

.... 

.... 

.... 

10 

10 

04 

81 

877 

*17 

03 

10 

32 

00 

.... 

.... 

.... 

IS 

04 

30 

78 

2707 

00 

•4« 

OS 

38 

20 

.  •  •  . 

.... 

20 

10 

73 

83 

000 

1201 

•11 

4700  *13 

347  00 

740  17 

1233  *14 
483  *10 


d  II  41  H  Ml  ill
Table 28.15
MATHEMATICAL  BACK 


Te*t  and  Code  No. 

Nt 

Ca 

!> 

1. 

Ii 

J 

LE 

MB 

ME 

pm’ 

M. 

V.M 

Mi 

N 

P 

PI 

Biocraphicai  Data, 
CEtXttD  .  . . 

3000 

' 

43 

-04 

10 

10 

-14 

(Navigator) 

Mathematic#  A. 
CI703P . 

3000 

37 

07 

*41 

05 

00 

MECHANICAL  EXPERI 


11 

77 

30 

01 

-00 

-03 

00 

07 

74 

-08 

00 

00 

...  . 

13 

-05 

14 

03 

-04 

04 

50 

03 

-07 

08 

-03 

17 

-13 

68 

03 

-13 

01 

36 

54 

07 

30 

-It 

51 

-03 

29 

so 

-20 

14 

03 

31 

46 

1.  .  . 

17 

00 

00 

45 

1 

-00 

43 

00 

-01 

43 

01 

01 

15 

13 

05 

05 

43 

35 

17 

41 

30 

-08 

17 

»3J 

• 

13 

30 

Tool  Function, 

CI'jOsA . 

Mnhmittl  Informa¬ 
tion,  C 1804 A . 


Mechanical  Principle#, 
C1U03A . 


Mechanical  Principle#. 

CI9Q3B . 

Practical  lednnt, 

CIJ01BXI . 

(Mechanical  llrnt) 
Physical  Principle#, 

ClnOlBX . 

Biocraphicai  Data. 

C Ett«D  (Pilot).... 

DrieiacSUR, 

C1307AXI . , 

Judgment  (Mechnakal) 
AAFQEJRP-4... 
Mechanical  Conete- 

bcruioa.  ACIOU . 

Mechanical  Compio- 
bcnaion,  AClOD. . . . 
Mechanical  Function*. 

CI907A . 

Judgment  lira*. 
AC10A,  AAPQE  JR 
r-3 . 

Technical  Vocabulary, 

CE303C  (Pilot) . 


Information  la  Jade- 
ant.  AAPQE  JR 

P-« . . . . 


IS) 

37*1 

7385 

203 

1300 

3000 

303 

1713 

153 

i 

570 

153 

1713 

3000 

1713 


PAIRED  ASSOCIATES 


Alan  Memory, 
CI30VAX3 . 

338 

176 

179 

336 

393 

393 

393 

14 

16 

00 

41 

68 

55 

54 

51 

M 

31 

35 

07 

35 

It 

35 

33 

33 

07 

35 

Map  M  unary. 

Cl  303  A  Xl. . 

Map  Memory, 
CIV03AXI . 

Map  Memory, 

CI503UX1 . 

05 

IS 

33 

35 

Plan.  Formation. 
CPMUB . 

09 

09 

04 

Directional  Qrleotn 
lion.  CP51SB . 

Pattern  Anaiyaia, 
CP513A . 

Test and Code No.


N» 


... 

Plano-Nama  Memory, 

CI506AXI . 

2f;l 

Memory  for  Ued* 

Dftiki,  C1M0AXI. . 

— 

417 

Numerical  Operation*, 
CI701B  (Hack) .... 

Nemtrin]  Operetta)*, 
C170IB  (Float;.... 

Plot  tin*  Aeeuraey, 

CE463A . 

(Total  Iliiklal 

Nemerr<-aJ  Operation*, 
CI701B  , Total).... 

Mathematic*  B, 
C1200U.  C1706A. . . 


Dial  aad  Tablo  R  radio*.] 
CP631A.  CP622A 


Cob  plot  Seal#  Readl  n*J 
CtfMA  (Total  R)~ 
Malhemalta  A. 

CI702P . 

Ptatlia*  Toot,  CE4S2A 
(Total  ‘ 


Tablo  Readbg. 

CT6J1A . 


Malhematie*  B. 

CI200C . 

Number  Berta  Coat* 
pktioa.  C1216AX1. 
Directional  PMUm, 
CKtiiA  (Total  ft). 
Orsaaitalioaal  ITaa- 
ain*.  CM07AX1. . . 
Oriuiuuoiul  1’Ias- 
ainc.  CI407BXI... 


6386 

UM 

1M 

>73 


6000 


a&t 

aooo 

364 

103 


103 

1360 

303 

154 

303 

360 


Speed  of  IdeaUlcatioe. 
CP610A . 

HI 

(Noe-Rotated) 

Speed  of  Iiitnlilntne, 
ON  IGA  (Rotated) . . 

76ft 

Spatial  OrteaUUaa  I, 
CP501A . 

iss 

Spatial  Orieetatioa  1, 
CP50IB . 

7603 

Spatial  Ortaitatiea  II. 
0*5010 . 

6500 

P*rtiil-P*U  Traetac. 

CPSIJA . 

TOO 

Hip  Plaaaioc. 

Cltl-’AXl . 

170 

Block  Coaatia*, 
CP413A . . 

PUaolna  a  (Wilt, 

Cl  40 1A . 

Table  28.15 
MEMORY  III 


04 

19 

07 

01 

08 

03 

1 

1 

-03 

-00 

— 

81 

78 

M 

— 

04 

10 

08 

03 

B 

14 

-06 

66 

06 

07 

04 

04 

08 

13 

67 

-01 

— C2 

16 

-03 

S3 

31 

11 

06 

63 

*17 

07 

61 

06 

00 

33 

in 

■ 

OS 

n 

37 

0B 

■ 

10 

so 

30 

13 

04 

00 

16 

-01 

u 

I 

48 

-o: 

-07 

47 

05 

-0) 

■ 

i 

m 

■ 

44 

■ 

i 

13 

I 

■ 

20 

41 

11 

03 

36 

1 

m 

TO 

38 

m 

■i 

■a 

■ 

m 

PERCEPTUAL  SPEED 


I 


Table 28.15
PERCEPTUAL  SPEED 


Tact  and  Coda  No. 

N/ 

Co 

I. 

It 

If 

J 

LE 

MB 

ME 

pJl 

M. 

VM 

Mi 

N 

P 

PI 

Aerial  PEotoarapLa, 
tll*90IA-IV . 

302 

06 

16 

27 

30 

Dcnxlina.  CI2I4AXI. . 
Decoding.  CI3I4AX2. . 

302 

00 

12 

36 

1000 

3 

-17 

31 

PILOT  INTEREST 


Hanoi  a*  Air  Mint*- 

cera/C  WOSAXl . . . . 
Rout*  Pine  run*. 

CI41IAXI . 

Ptaaaiac  Air  Moocu- 

rerc,  CI40*AX3 _ 

Plan  nine  a  Circuit, 

CltOIA . 

Practical  FjiiaalioM  I 

C130HAXI . 

Practical  Ja  dtpwaut. 

CUOIRXl . 

(Noa-MadkaaicaQ 


303 

06 

e» 

203 

170 

203 


17 

01 

IS 

06 

37 

•43 

01 

07] 

II 

-II 

10 

3 


-02 

-10 

.... 

00 

17 

06 

•41 

-081 

16 


PSYCHOMOTOR COORDI


PantaiwIiM  Rut- 
UvaTiM.UVlIU 


Rw«(  tVctrril/. 
OJII6A . 


6000 


6000 


03 


03 


ON 


-0« 


PSYCHOMOTOR SPEED

La|  Baak  Accuracy. 

Marti**  Accuracy. 

XI . 

2M 

366 

.  .  . 

06 

-or 

06 

01 

60 

-07 

01 

-10 

-06 

-»t 

.... 

.... 

M 

03 

II 

U 

830 


(Continued)
FACTOR (Continued)


Table 28.15

GENERAL  REASONING 


gaslit!  KMMAikC. 
C131IDX1 . 

Spatial  Viaualiiatio*  II. 

CI303AX1 . 

Ptetur*  loitgralio*. 


JtdiiMM  (RcMooiaa) 
AAFOL  JK  P-6. . . . 
Practical  II. 

CI30IOX3 . 


(Work  Plan) 
Competitive  nuaiai, 


Ni  I 

C* 

670  | 

... 

1 

W 

1713 

3399 

13 

163 

404 

303 

303 

1713 

170 

373 

303 

REASONING  II 




lartnant  Coup**- 
bcnaoa  11.  CI6I6B.  MS  .... 
CompWi  Coordiaatiaa. 

CM70IA .  7S02  09 


FUmlif  t  Corn, 

CI40CAX7  .  439 

laelrumc  at  Cornpf*. 

brown*  I.  CI9I3.A. . .  443 

IWv  Firim.  Cirri*. 

CPJI2A .  397 

l)»*I  t!>4  T*hi*  Rfti- 
iac.CrulA.CiMTSA  9000 


DtMTUttl.iatio*  H»*»- 

Immi  Tun*.  CP9I10. .  6000 


Cab**.  CI'lllA  .  4iil 

OmU  K**a>*«.  Cr»73A  391 

Dinrtwoil  Oricala- 
Uo*.  Cl’JtWl .  »3 

T*»IU*d  CowAm. 
tM.  CM  101A .  4399 


PVrturr  iMKntlt*. 

CP104A .  391 

hn>  I  IrliMlM*  IS. 

CP3UA .  »3 

D«wbx.  CIJI4AX3  1800 


REASONING  III 


SPATIAL  RELATIONS 


31  ... 
37  ... 


#7  ... 

0*  0* 


(Continued) 
FACTOR  (Ri) 


Table 28.15


SPATIAL  II 


Tent  arid  Code  No. 

N 1 

r 

Co 

I> 

It 

J 

I.B 

MB 

ME 

Mi 

PM 

M. 

VM 

— 

Mi 

N 

P 

PI 

lUndf,  C1\>12A . 

658 

13 

~02 

06 

01 

13 

-05 

17 

13 

Flap*,  Figure*,  Card*, 
CP5I2A . 

392 

06 

15 

-05 

31 

_ 

SPATIAL  III 


Two-IUnd  Coordina¬ 
tion  PM  101 A 

0266 

8900 

554 

354 

-07 

—  08 

22 

-03 

-08 

-06 

00 

08 

09 

01 

40 

09 

00 

-02 

>61 

*u 

09 

10 

08 

10 

Rotary  Pu.nuit, 
CMW13V2 . 

Plot  tin*  Teat,  CE452A. 

(Total  Right*) 

Dr  mionnl  Plotting, 
cuttt 

. 

(Total  Right*) 

SOCIAL  SCIENCE  BACK 


1000 

1900 

01 

08 

12 

-08 

15 

08 

.... 

LJ 

L_ 

l 

_ 

VERBAL 


Technical  Vocabulary, 
CK505C  (Navigator) 

3638 

03 

03 

01 

06 

14 

10 

14 

10 

n 

00 

Vocabulary,  CI604B 

11 

• 

(AAK)  . 

1000 

153 

04 

-11 

Phyid™,  CI80IA . 

01 

21 

17 

Reading  Comprehen- 

202 

13 

33 

20 

-02 

Reading  Comprehen- 

570 

-01 

05 

20 

18 

-03 

08 

1  . 

1000 

08 

-08 

08 

1900 

01 

12 

15 

General  Information, 
CE5051)  (Navigator) 
Reading  Comprehen- 

20 

20 

3000 

27 

02 

10 

6372 

07 

04 

05 

37 

12 

03 

04 

Reading  Comprehen* 

10 

16 

20 

-01 

266 

22 

05 

04 

12 

-03 

Vocabulary.  AAFJE 

Jll  P-| . 

1713 

06 

10 

Memory  for  Tactical 

I'iana.  CI509AX.  .  .  . 

179 

10 

-12 

-02 

i  t  r  - 

Reawning  (In  Read- 

inz).  AAFQE  JIl 
P-10 . 

1713 

13 

00 

Mathematica  A* 
CI702K. . . 

42 

C7 

-12 

3000 

-04 

Technical  Vocabulary, 

CKS05C . 

0“ 

3000 

15 

04 

*33 

(Bombardier) 

Technical  Vocabulary, 

CKiv/iC  (Pilot) . 

3000 

*39 

-08 

17 

*3* 

Seoucnce  of  Maneuver* 
CUI0A . 

*33 

202 

00 

30 

00 

Math' malic*  A, 

CI70JK . 

3000 

*37 

07 

*il 

06 

00 

8yil»zi:in*.  AAFQE 

jit  r-n  . 

1713 

10 

07 

iea.  . 

Cen-raJ  Information, 

CKS05L)  (1'ilot)  . .  . 

3000 

00 

36 

-10 

23 

*33 

fUascnir.*  (Deductive) 
AAFQE  lit  P-6... 

1713 

08 

Oil  . . . 

■  a  — 

1 

* - 

.. 

834 


J •••  ■*  ’ 


{Continued) 
FACTOR  (Si) 


Table  28.15 

VISUALIZATION 


Teat  and  Coda  No. 

N/ 

0,1 

1 

i> 

It 

J 

LE 

MB 

ME 

M, 

P.\l 

Mi 

VM 

Mi 

N 

P 

PI 

Direction!)  Plotting, 

354 

202 

354 

206 

153 

7385 

202 

354 

1713 

1713 

202 

202 

658 

UI 

08 

03 

-13 

05 

(Totr.l  Wronga) 
Spatial  Viaua liiation  I, 
m?aiaYt 

18 

14 

01 

20 

37 

08 

24 

-02 

Mechanic*)  Principles. 
CI003II . 

1 

t 

-12 

04 

*  68 

04 

38 

*50 

t 

02 

.... 

Spatial  Viaualisation  1, 
CA2Q4XX2  .  ... 

01 

03 

34 

06 

-04 

-04 

Mechanical  Move* 

Mechanical  Principle*, 
C1003A . 

.... 

13 

-05 

14 

13 

03 

.... 

02 

.... 

-07 

-04 

*44 

Pa  turn  Com  prehen* 
•mn  PPKnlu 

Directional  Plotting, 
m.UA 

-03 

(Total  RighU) 
Mechanical  Compre- 
henaion  AAFQE  JR 

P-0 

05 

09 

21 

39 

22 

*4* 

Mechanical  Move- 
menu,  AAFQE  JR 

P-12 

Driving  8kili« 
rr.wAXi 

17 

17 

02 

08 

08 

01 

a  a  •  a 

a  a  a  a 

Spatial  Viaualitatlon  II. 
f!l203AXl 

17 

-00 

Map  DiaUnee,  CP628B 

.  ..  . 

oi 

a  .  a  a 

.  a  a  . 

*30 

.  .  .  . 

17 

a  a  a  • 

04 

a  a  a  a 

'Decimal  polnU  omitted. 

•Loadinri  in  italic,  indicate  that  a  teat  U  aleo  lilted  under  Mother  factor  in  which  tbia  londinc  ie  Wjh. 
•Derived  from  combination*  of  data  on  aimilar  forma. 
PI: The pilot-interest factor is common to the pilot criterion and
to tests designed to measure pilot interest.

Pl: This is a planning factor of some type, so called because it is in
common to certain planning tests. It may involve visualization of a
creative type.

PM₁ (PC): Psychomotor coordination. This factor is substantial in
all psychomotor tests analyzed, with the possible exception of Rudder
Control. Whether it represents eye-hand coordination or integration of
muscular movements, or both, is not known. It seems best described as
general muscular agility.

PM₂ (PP): This is the psychomotor-precision factor, heavily weighted
in the bombardier criterion and in psychomotor tests that require precise
manipulations under speed requirements.

PM₃ (PS): This is the psychomotor-speed factor restricted to two
highly similar tests that require simple, rapid movements (marking an
answer sheet).

R₁ (GR): This is the general-reasoning factor found extensively in
most reasoning tests and strongly in Arithmetic Reasoning.

R₂: The second reasoning factor is hard to define. It is quite strong
in the figure analogies test and other tests that are factorially so
complex that the clues to its identity are difficult to isolate.

R₃: The third reasoning factor is strongest in tests that seem to
require sequential reasoning and in which frequently one can arrive at the
correct answer by elimination of wrong answers.

S₁ (SR): This is the spatial-relations factor which seems to involve
relating different stimuli to different responses, either stimuli or
responses being arranged in spatial order. It is not clear whether the
appreciation of spatial arrangement of stimuli or of responses separately
is the key to the factor.

S₂: This is a spatial factor restricted to a few tests such as
Thurstone's hands and flags tests. An appreciation of right-hand-left-hand
discrimination may be an important aspect of the factor.

S₃: The third space factor was found in only one analysis, that of
the carefulness battery. The nature of S₃ is still unknown.

SS: The social-science background factor has been boldly generalized
from its strong communality in History and Geography examinations.

V: The verbal factor is best epitomized by vocabulary tests or simple
verbal-comprehension tests.

Vz: This visualization factor is strongest in tests that present a
stimulus either pictorially or verbally, and in which some manipulation or
transformation to another visual arrangement is involved.
FACTOR COMPOSITION OF THE PILOT CRITERION


Validities  of  the  Factors 

Reference was made earlier to estimates of validities of factors that
emerged in analyses of the classification batteries. Those estimates
pertained to less than half the entire list that was just given. It is of
utmost importance that we examine the additional factors not already
represented in the classification battery, for in them lie clues to unique
variances not yet covered. If any of them are also components of job
criteria and can be shown to be weighted to any significant degree in
those criteria, they may be the basis for increasing the validities of
composite aptitude scores to a practical degree. It is therefore desirable to
extract from the assembled evidence as much indication as we can,
inaccurate though it may be, of the loadings for pilot, bombardier, and
navigator criteria in the various new factors. Owing to the lack of
validation data for tests for bombardier and navigator training, we shall be
restricted to the pilot criterion in this study. The least that this account
does is to demonstrate the usefulness of a limited list of common
reference factors in describing criteria and in predicting validities for tests
that have never been validated directly.

A basic equation. In estimating the loadings for the pilot criterion,
three successive steps or procedures were used. Each battery of tests
analyzed (excluding the classification batteries) served as a basis for
the first two steps. The basic equation that gives the correlation between
two variables as a function of their common-factor loadings, when
applied to the special case of pilot validity, reads:

$$r_{ap} = a_1 p_1 + a_2 p_2 + a_3 p_3 + \cdots + a_m p_m$$

in which

$r_{ap}$ = the validity of test A for pilot training,

$a_1, a_2, a_3, \ldots, a_m$ = loadings of factors 1, 2, 3, ..., m in test A,

and

$p_1, p_2, p_3, \ldots, p_m$ = loadings of factors 1, 2, 3, ..., m in the pilot
criterion.

For each test whose factor loadings and pilot validity are known, there
is one such equation, in which $p_1$, $p_2$, $p_3$, etc., are unknown. There are
as many equations as there are tests and as many unknowns as there
are factors. The solution of these simultaneous equations yields
estimates of $p_1, p_2, p_3, \ldots, p_m$.
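
The set of equations can also be written compactly in matrix form. The notation below is an assumption introduced here for illustration and does not appear in the original: $A$ is the table of test loadings (one row per test, one column per factor), $\mathbf{r}$ the column of pilot validities, and $\mathbf{p}$ the column of unknown criterion loadings. When there are more tests than factors the equations generally cannot all be satisfied exactly, and a least-squares estimate is the usual compromise.

```latex
\[
\mathbf{r} = A\,\mathbf{p},
\qquad
A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1m} \\
a_{21} & a_{22} & \cdots & a_{2m} \\
\vdots &        &        & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nm}
\end{pmatrix},
\qquad
\hat{\mathbf{p}} = \left(A^{\mathsf T} A\right)^{-1} A^{\mathsf T}\,\mathbf{r}
\]
```

Here $a_{ij}$ is the loading of factor $j$ in test $i$, so each row of $A$ reproduces one instance of the basic equation.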

Solution of the estimates. The first step was to estimate the loadings
$p_1$, $p_2$, $p_3$, etc., by a least-square solution. The number of unknowns in
some sets of equations was reduced by assuming that the loadings for
the verbal, numerical, and general-reasoning factors were zero. In still
other instances, all of which will be noted in table 28.16, other loadings,
principally for factors ME and PM, were assumed and corresponding
allowances made for them in the equations. These assumed loadings
were taken from the December 1942 battery analysis (see table 28.14).
After having thus reduced the number of unknowns, the number of
equations was then reduced to the number of unknowns by computing
sums of squares of the deviations of loadings in each column from the
mean of the column, and the sums of cross products between all
possible pairs of columns. These equations were then solved
simultaneously by a modified Doolittle method.
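
The first step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used: it forms the normal equations from raw sums of squares and cross products (the text describes deviations from column means), and it solves them by ordinary Gauss-Jordan elimination rather than a modified Doolittle method. The function names and the small data set are hypothetical.

```python
def solve_linear(matrix, rhs):
    """Solve a square linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Pick the row with the largest pivot to keep the elimination stable.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def estimate_criterion_loadings(test_loadings, validities):
    """Least-squares estimate of criterion loadings p from r = A p."""
    m = len(test_loadings[0])
    # Sums of squares and cross products between all pairs of factor columns.
    ata = [[sum(row[i] * row[j] for row in test_loadings) for j in range(m)]
           for i in range(m)]
    # Cross products between each factor column and the validities.
    atr = [sum(row[i] * r for row, r in zip(test_loadings, validities))
           for i in range(m)]
    return solve_linear(ata, atr)


# Hypothetical loadings of four tests on two factors (not data from the tables).
A = [[0.6, 0.1], [0.2, 0.5], [0.4, 0.4], [0.1, 0.7]]
true_p = [0.30, 0.20]
# Validities generated from the basic equation, so the fit is exact.
r = [sum(a * p for a, p in zip(row, true_p)) for row in A]
p_hat = estimate_criterion_loadings(A, r)
```

Because the hypothetical validities were generated from the basic equation without error, `p_hat` recovers the loadings 0.30 and 0.20. With real validities the solution would only approximate them.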

The second step consisted in minor adjustments in the solutions just
described. It was recognized that not all valid factors were represented
in each set of equations. Accordingly, one might expect that the basic
equation when applied to incomplete data should frequently yield
underestimations of validities, but should rarely give overpredictions. In this
step, by trial and error, adjustments, usually downward, were made in
the estimated values for $p_1$, $p_2$, etc., until a more reasonable fit, in
accordance with the principles just stated, was accomplished.

The results from steps one and two are given in table 28.16. A blank
in the table indicates that a factor did not appear in an analysis.
Assumed loadings are indicated by a superscript "1." For some factors,
only one estimate is available, and for others a number of estimates.
The stability of any loading, or lack of it, can be seen in the rows of the
table. The concordance of many estimates with previous estimates may
also be noted. The agreement is generally reassuring, though
considerable variation from sample to sample is obvious. Of the seven factors
whose loadings had been previously estimated (see table 28.14), judging
from the mean loadings given in the next to the last column, the validities
of the mechanical-experience and psychomotor-coordination factors had
been somewhat underestimated in the December 1942 analysis, and those
for perceptual speed, spatial relations, and visualization had been
somewhat overestimated. In the last column are given estimates rounded to the
nearest 0.05, which is taken to be the limit of accuracy probably justified
here for most factors.

The third step was a checking method which cut across factor batteries.
Assuming the loadings in the last column of table 28.16 to be correct,
each factor was taken in turn as the "unknown" or X factor whose
validity was to be determined. All tests in table 28.15 in which factor X
had a substantial loading were examined in the following manner. No test
was included unless its loading in X was equal to or greater than 0.30.
Using the fundamental equation as the basis, the part validity of each
test (omitting factor X) was estimated and a residual validity which
could probably be attributed to factor X was computed. The division of
the residual by the loading of factor X in the test gave one estimate of
the validity of factor X. There were as many estimates of this validity
as there were tests involved. A mean of these estimates was taken to be
the best value. In some instances, as with the three integration factors
which occurred combined so frequently in the same tests, more than one
unknown was assumed. In these instances, simultaneous equations with
two or three unknowns were solved, using the residual validities as before.
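
The single-unknown version of this checking step can be sketched as follows. The function name, the dictionary layout, and the three tests are hypothetical and chosen only for illustration. The criterion loadings for P, ME, and V are the values quoted in this chapter, and the 0.30 cutoff is the one stated above.

```python
def factor_x_validity(tests, criterion_loadings, x, min_loading=0.30):
    """Mean residual-based estimate of the criterion loading of factor x.

    tests: list of (loadings, validity) pairs, where loadings maps factor -> loading.
    criterion_loadings: loadings of the other factors, assumed correct.
    """
    estimates = []
    for loadings, validity in tests:
        a_x = loadings.get(x, 0.0)
        if a_x < min_loading:
            continue  # only tests with a substantial loading in X are used
        # Part validity: the basic equation with factor X omitted.
        part = sum(a * criterion_loadings.get(f, 0.0)
                   for f, a in loadings.items() if f != x)
        residual = validity - part
        estimates.append(residual / a_x)
    if not estimates:
        raise ValueError("no test has a loading of at least %.2f in %s" % (min_loading, x))
    return sum(estimates) / len(estimates)


# Criterion loadings taken as known (values quoted in the chapter).
known = {"P": 0.15, "ME": 0.27, "V": -0.05}
# Hypothetical tests: (factor loadings, obtained pilot validity).
tests = [
    ({"S1": 0.50, "P": 0.30}, 0.205),
    ({"S1": 0.40, "ME": 0.20}, 0.182),
    ({"S1": 0.10, "V": 0.60}, 0.000),  # excluded: loading in S1 below 0.30
]
s1_estimate = factor_x_validity(tests, known, "S1")
```

In this constructed case both admissible tests give the same estimate, 0.32. With real data the individual estimates would scatter, which is why a mean is taken.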

The factor loadings in the pilot criterion. This procedure (step 3)
led to verification of most of the previous estimates given in the last
column of table 28.16. The revised estimates are presented in table 28.17,
with all factors except K (kinesthetic) accounted for. Also presented
are the numbers of batteries (n) in which estimates had been made,
including the two classification batteries, July 1943 and December 1942,
as well as the batteries listed in table 28.16. The checking procedure of
step 3 is not counted as one of the observations in deriving n.

Comparison of the two tables will show a number of changes brought
about by step 3. The most drastic ones are for Reasoning II and
Reasoning III, whose loadings were reduced from 0.15 and -0.20 to
0.05 and -0.05 respectively. The loading of 0.22 for R₂ found in the
judgment-and-reasoning battery is suspect, because it was not clear that
the R₂ in that battery was one factor or a combination of two factors.
The loading of -0.21 for R₃ is unlikely. One would not expect a negative
loading of any substantial size in any factor, though loadings of -0.05
and -0.10 have been accepted in other factors than R₃. In checking
(step 3), the loadings were estimated for R₂ and R₃ simultaneously. The
least-square solution (in connection with step 3) gave loadings of 0.15
for R₂ and -0.05 for R₃. Certain residuals indicated the reduction to
0.05 for R₂, however, and verified the loading of -0.05 for R₃.

The validity of 0.00 previously estimated for Integration II becomes
0.10 in table 28.17. PM₁ loses weight from its unparalleled 0.30 in table
28.16 to a value of 0.20. Length estimation has a loss from 0.20 to a
more conservative 0.15. M₂ insists upon having a validity of 0.05 in the
final checking and M₃ gains from 0.10 to 0.15. Pl, which had two
estimates of 0.03 and 0.04 but a third of 0.00, gains a place in the nonzero
class with a validity of 0.05. S₃ curiously enough shifts from +0.05 to
-0.05. With so few tests as we have to represent it, we can only be
puzzled by this result and wait for further confirmation one way or the
other. Loadings that remained the same through the checking procedure of
step 3 were those for factors I₁, I₃, J, ME, M₁, N, P, PM₂, R₁, S₁, S₂, V, and Vz.
Most loadings are left in the rougher values to the nearest five hundredths.
Two of the stronger and better established loadings, however, are given
to the nearest one hundredth, namely, those for factors ME and S₁.

Negative factor validities. Negative loadings present something of a
question. A negative validity would mean that individuals having a high
degree of a factor do less well on the average than individuals who have
average or low degrees of it. The ability may actually be regarded as a
handicap. On this basis, most of the negative validities in table 28.17
can be successfully rationalized. The factors in which the negative
loadings occur for the pilot tend to be abilities involving either tedious,
restricting, or symbolic activity, or combinations thereof. The verbal factor
is highly symbolic. I₃ involves attention to many details. R₃ seems to be
reasoning where fine distinctions are important. The nature of S₃ is not
sufficiently known to justify any opinion, nor is its negative loading to be
fully accepted. The social-science factor probably represents a negative
interest in symbolic, indoor activities, for pilots have a positive interest
in overt, outdoor activities.

Table 28.17. Final estimates of factor validities for the pilot criterion

Factor        Loading      Factor        Loading
I₁              0.25       Pl              0.05
I₂               .10       PM₁ (PC)         .20
I₃              -.10       PM₂ (PP)         .00
J                .10       PM₃ (PS)         (*)
LE               .15       R₁ (GR)          .00
MB               .10       R₂               .05
ME               .27       R₃              -.05
M₁ (PM)          .05       S₁ (SR)          .32
M₂ (VM)          .05       S₂               .05
M₃               .15       S₃              -.05
N                .00       SS              -.10
P                .15       V               -.05
PI               .25       Vz               .20

* No estimate made for lack of validity data.

Validity  of  Factor  Composites 

From the data in table 28.17 we can derive an estimate of the maximum
validity to be expected for pilot selection on the basis of factor validities,
when factors are optimally weighted in a composite score in proportion
to those validities. In this discussion, orthogonality of factors is assumed.
In the last classification battery in use in the Air Corps during the year
1945, of the factors listed in table 28.17, only J, MB, ME, P, PI, PM₁,
S₁, and Vz have positive loadings. A sum of the squares of their validities
yields an estimate of the maximum possible multiple correlation squared
(R²). Considering these factors only, the sum of squares is 0.3603, and
from this R is 0.60. This probably coincides fairly well with empirical
results, except that there is probably some reduction in the obtained
multiple R due to the weighting of irrelevant factors like N and R₁, and
the positive weighting of V when it should have a negative weight. These
irrelevant variances are introduced by the use of impure tests like
Arithmetic Reasoning, Reading Comprehension, and Dial and Table Reading.
Other discrepancies would be due to the fact that the MB factor is not
weighted in the composite and that the K factor is weighted but not
represented in the list considered above. The unique contribution of the
Rudder Control test is accordingly missing from all the estimates of
the multiple R made here.

It is of interest next to see what pilot validities could be obtained if
all valid factors were included in the composite and optimally weighted.
Consider first only the factors with positive weights. The estimate of R²
is then 0.4903, and R is 0.70, which represents a gain of 0.10 in maximum
validity. This could be accomplished at the cost of adding a maximum of
nine tests to the battery, one for each additional factor. If impure tests
were used for this purpose, properly combining these factors, fewer than
nine tests would be needed. If negatively weighted factors were also
included, R² would become 0.5178 and R would be 0.72. This gain of
0.02 would be at the cost of adding four tests (the verbal factor being
already covered). The cost in this case might be too great for the gain
obtained. There is also the general question of the policy of weighting any
ability negatively for fear it might mean adverse selection for some
other criterion in the job. Selection of tests that are to contribute negative
weights should always be made with care.
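
The arithmetic behind these three figures can be checked with a short script. This is a sketch under stated assumptions: the loadings are the final estimates of table 28.17 as read here, with S₁ taken as 0.32 and S₃ and V as -0.05, the readings that reproduce the printed sums of squares. The dictionary keys and function name are hypothetical labels. Factors are assumed orthogonal, as in the text, so the squared multiple correlation is simply the sum of squared loadings.

```python
import math

# Final pilot-criterion loadings (table 28.17 as read here; PM3 has no estimate).
loadings = {
    "I1": 0.25, "I2": 0.10, "I3": -0.10, "J": 0.10, "LE": 0.15,
    "MB": 0.10, "ME": 0.27, "M1": 0.05, "M2": 0.05, "M3": 0.15,
    "N": 0.00, "P": 0.15,
    "PI": 0.25,   # pilot interest
    "Pl": 0.05,   # planning
    "PM1": 0.20, "PM2": 0.00, "R1": 0.00, "R2": 0.05, "R3": -0.05,
    "S1": 0.32, "S2": 0.05, "S3": -0.05, "SS": -0.10, "V": -0.05, "Vz": 0.20,
}


def max_multiple_r(factors):
    """Return (R squared, R) for an optimally weighted composite of orthogonal factors."""
    r_squared = sum(loadings[f] ** 2 for f in factors)
    return r_squared, math.sqrt(r_squared)


# Positively loaded factors covered by the 1945 classification battery.
battery_1945 = ["J", "MB", "ME", "P", "PI", "PM1", "S1", "Vz"]
positive = [f for f, v in loadings.items() if v > 0]
nonzero = [f for f, v in loadings.items() if v != 0]

r2_battery, r_battery = max_multiple_r(battery_1945)
r2_positive, r_positive = max_multiple_r(positive)
r2_all, r_all = max_multiple_r(nonzero)
additional_factors = len(positive) - len(battery_1945)
```

Under these readings the three sums of squares come to 0.3603, 0.4903, and 0.5178, giving R of about 0.60, 0.70, and 0.72, and the count of additional positive factors is nine, in agreement with the text.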

What has just been done for pilot tests and factors could also be done
for navigator and bombardier. Owing to the lack of data, it has been
impossible to make satisfying estimates for navigator and bombardier
criteria except for factors in the classification battery (see table 28.14).
Referring to the data of that table, particularly data for the
factor-analysis solution, we can see what the December 1942 battery validities
could have been. Squaring the loadings in each column, summing, and
extracting the square roots yield estimates of maximum validities of 0.50,
0.74, and 0.63 for bombardier, navigator, and pilot composites respectively.
The empirical validities fell somewhat short of these values; in fact, they
were in the neighborhood of 0.30, 0.65, and 0.53 respectively. The
discrepancies can be attributed to improper weighting and possibly to
overestimates of factor validities in some instances.

Job  Analysis  by  Factor  Analysis 

In  chapter  1  it  was  intimated  that  job-analysis  information  would  be 
forthcoming  in  later  chapters  based  upon  factor-analysis  results.  It  is 
time  now  to  make  good  that  promise  and  to  see  how  the  empirically 
determined  factors  agree  with  the  traits  derived  from  direct  inspection 
of  air-crew  jobs. 

Inspection categories versus factor categories. Of the list of 20 traits
used in describing the job of the pilot in table 1.5, very few concepts
have their counterparts in the factor list in this chapter, and none of
these has identical meaning. Judgment in table 1.5 has quite a different
connotation than the J factor in table 28.17. Foresight and planning of
table 1.5 has only one factor resembling it, the factor doubtfully called
Planning (Pl). Memory breaks down into at least three separate abilities.
Comprehension is replaced by the verbal factor, which is restricted to
word material.

Visualization of flight course may mean nothing more than the
visualization factor when that term is applied specifically to aviation by aviation
observers, but this is very doubtful. Estimation of speed and distance
must indeed be separated, for distance-judgment tests correlate very
low with speed-judgment tests. A length-estimation factor was found,
and it is possible there is also a speed-estimation factor, but none was
uncovered by correlation analyses. Sense of sustentation may be akin to
the kinesthetic factor found in the Rudder Control test, but no evidence
is available regarding this possible connection. "Division of Attention"
did not appear as a distinct factor, but one or more of the integration
factors may be identified with it. Orientation almost surely breaks down,
but its components are not clear. It is doubtful that what aviation
observers call orientation depends to any great extent on the S₂ factor. It
may depend to some extent on S₁. A factor concerning compass
directions might have been expected, but none has as yet made its appearance.
Speed of decision and reaction is a superficial concept. The test designed
to measure it, Discrimination Reaction Time, proved to break down
into S₁, PM, and P factors. Factor PM₃ is one type of speed of reaction,
but its loading in the Discrimination Reaction Time test is as yet
problematical.

Coordination is an ambiguous term, but in table 1.5 it probably refers
to purely motor coordination or psychomotor coordination. If so, the
PM₁ factor is very close to it, but there is evidence, not given in this
volume, that there is also a motor coordination factor at the
finger-dexterity level.

From  this  point  on  in  table  1.5,  the  traits  show  less  and  less  corre¬ 
spondence  with  the  empirical  factors.  One  reason  is  the  lack  of  factor- 
analysis  data  in  the  area  of  temperament.  The  only  temperament  factor 
to  be  found  listed  in  table  28.17  is  that  of  pilot  interest;  all  others  are 
apparently  abilities  with  the  exceptions  of  mathematical  background, 
mechanical experience, and the social-science factor, which are apparently
experience  variables. 

Factors in table 28.17 not mentioned in table 1.5 are the integration
factors, mathematical background, mechanical experience, numerical, the
three reasoning factors, and the social-science factor. Of these only
mechanical experience and possibly Integration I have substantial validity
for the pilot. Even so, these two were missed in the list of table 1.5.
Two  others  had  been  anticipated  in  connection  with  navigation,  namely, 
mathematical  background  and  the  numerical  factor. 

Factor  categories  as  an  aid  to  inspection. — Very  few  factors,  as  was 
said  before,  were  exactly  what  had  been  expected.  Having  brought 
them  to  light,  however,  we  can  now  ask  whether  future  observations  of 
jobs  would  be  facilitated  by  their  use  as  compared  with  the  trait  cate¬ 
gories  originally  employed.  It  is  quite  possible  that  this  is  so.  As  psychol¬ 
ogists  who  worked  with  these  factors  became  more  and  more  familiar 
with  them,  it  became  natural  and  apparently  easy  to  examine  a  new  test  or 
a  new  task  and  to  predict  with  some  confidence  the  factors  probably 
present  and  their  relative  importance.  It  is  believed  that  with  increased 
definition of variables, this type of analysis by inspection will take on
greater  utility  than  has  heretofore  been  possible  with  the  use  of  old 
concepts.  The  stability  (reproducibility)  of  the  same  variables  and  their 
communicability  lend  great  assurance  that  this  is  true. 
THE PREDICTION OF TEST VALIDITIES


One  virtue  of  factor  theory  which  has  as  yet  been  mentioned  only 
incidentally  is  its  utility  in  the  prediction  of  validities  of  tests.  This  is 
possible  by  means  of  the  basic  equation  given  on  page  839,  if  enough 
of  the  factor  loadings  are  known.  Assuming  that  both  new  test  and 
criterion  have  been  fully  accounted  for  in  terms  of  known  factors,  the 
prediction  of  validity  of  the  test  scores  can  be  made.  If  one  or  more 
factors are unaccounted for, the prediction will fall short of the validity
to  be  obtained,  but  it  will  at  least  give  a  notion  of  the  minimum  validity 
to  be  expected  if  all  known  loadings  are  positive. 
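
Predicting a validity from known loadings is a direct application of the basic equation. The sketch below is illustrative only: the test and its loadings are hypothetical, the function name is an assumption, and the criterion loadings are those quoted in this chapter (with S₁ read as 0.32). A factor missing from the criterion list contributes nothing, which is why the result is a floor rather than a full prediction when such a factor is valid and the known loadings are positive.

```python
def predict_validity(test_loadings, criterion_loadings):
    """Sum of products of test loadings and criterion loadings over known factors."""
    return sum(a * criterion_loadings.get(factor, 0.0)
               for factor, a in test_loadings.items())


# Criterion loadings for three factors (values quoted in the chapter).
pilot = {"S1": 0.32, "Vz": 0.20, "ME": 0.27}

# Hypothetical new test, fully described by known factors.
new_test = {"S1": 0.40, "Vz": 0.30, "ME": 0.20}
predicted = predict_validity(new_test, pilot)

# Same test with an extra loading on a factor whose validity is unknown;
# the unknown factor adds nothing, so the prediction is unchanged.
with_unknown = dict(new_test, X=0.35)
floor = predict_validity(with_unknown, pilot)
```

For the hypothetical test the predicted validity is 0.128 + 0.060 + 0.054 = 0.242, and it stays at 0.242 when the unaccounted factor X is added.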

In table 28.18 are given the predicted and obtained pilot validities of
all  tests  that  have  been  analyzed  and  for  the  description  of  which  this 
volume  is  primarily  responsible.  Since  the  factor  loadings  used  for  the 
predictions  were  derived  from  most  of  the  same  tests,  the  results  are 
in  a  large  sense  merely  a  verification  of  the  fit  of  the  factor  loadings 
estimated.  The  discrepancies  between  predicted  and  obtained  validities 
indicate  the  degree  of  divergence  between  assumption  and  fact.* 

There should be little concern regarding positive discrepancies,
particularly in the case of tests whose communalities fall short of their
respective reliabilities. The differences are probably due to unknown
common factors. If a discrepancy of this kind is fairly large, the test
deserves thorough scrutiny for hypotheses as to the unknown factor or
factors and a research project to define better the factor and to maximize
its variance in new tests. Tests in this category are Competitive Planning
(deficit validity of 0.14), Mechanical Functions (deficit of 0.14), Memory
for Tactical Plans (deficit of 0.13), Number Series Completion (deficit of
0.09), and Pursuit (deficit of 0.12). In all of these tests there is plenty
of room for common-factor variance not now accounted for. Granted
that these deficits are not due to sampling errors, these tests deserve
future intensive study.

Large negative discrepancies are disturbing, for they show that the
obtained  validity  is  more  than  accounted  for  by  known  factors.  In  one 
or  two  instances  such  discrepancies  led  to  a  reexamination  of  the  ob¬ 
tained  validities  and  as  a  result  gross  errors  were  found  and  corrected. 
In  other  instances  we  may  attribute  the  discrepancies  to  sampling  errors, 
if  N  is  relatively  small.  Experience  has  shown  that  with  samples  of  200 
pilots,  validities  fluctuate  all  the  way  from  0.20  to  0.75  (when  the  mean 
is  about  0.50).  With  large  samples,  one  might  well  suspect  errors  in  the 
estimation  of  factor  loadings  in  the  test  or  in  the  criterion. 

Within  the  group  of  tests  having  serious  over-predictions  of  validity 
are  the  following:  Judgment  of  Proportions  (excess  validity  of  0.10), 
Mechanical Movements (excess of 0.11), Practical Judgment (mechanical)
(excess of 0.10), Sequence of Maneuvers (excess of 0.10), and Tool

* A really crucial test, of course, would be to predict validities in advance and compare
predictions with validities obtained in new samples. The obtained validities in table 28.18 are the
same as those given in table 28.15.
Table 28.18. Estimated and obtained pilot validities of analyzed tests described
in this volume (Cont'd)

Test                                                Predicted  Obtained  Discrepancy        N
Plotting Test, CE452A (Total R)
Plotting Test, CE452A (Total W)
Plotting Accuracy, CE453A (Total R)
Plotting Accuracy, CE453A (Total W)
Practical Estimations I, CI308AX1
Practical Estimations II, CI308AX1
Practical Judgment (Mechanical Items), CI301BX1
Practical Judgment (Non-Mechanical), CI301BX1
Practical Judgment I (Non-Mechanical)
Practical Judgment II (Work Planning)
Pursuit-Path Tracing, CP512A
Reading Comprehension, CI614G
Reading Comprehension, CI614H
Route Planning, CI411AX1
Sequence of Maneuvers, CI410A
Shorter Line - Line Length, CP606B
Shorter Path - Path Distance, CP608B
Signal Interpretation, CI656AX2
Spatial Orientation I, CP501A                            .18       .17        -.01     3,965
Spatial Orientation I, CP501B                            .23       .20        -.03    10,925
Spatial Orientation II                                   .24       .26         .02    10,925
Spatial Reasoning, CI211BX1                              .10       .11         .01       999
Spatial Visualization I, CI204AX1                        .19       .15        -.04     1,445
Spatial Visualization I, CI204AX2                        .15       .12        -.03     1,685
Spatial Visualization II, CI203AX1                       .12       .17         .05     3,088
Speed of Identification (Non-Rotated), CP610A            .14
Speed of Identification (Rotated), CP610A                .23       .18        -.05    10,925
Tool Function, CI906A                                    .31       .17        -.14        78
Vocabulary, CI604B                                      -.03      -.08        -.05     1,662

Function (excess of 0.14). Two tests in this group have validities based
on relatively small samples: Sequence of Maneuvers (N = 247) and
Tool Function (N = 78). Other validities are based on large samples. It
is notable that with one or two exceptions, the tests in this group have
communalities which approach the reliabilities fairly closely, in contrast
to tests in the group mentioned in the last paragraph.

The general picture of the goodness of fit of predicted validities is
given in table 28.19. Nearly half of the discrepancies are within a range
of ±0.02 of zero and about two-thirds are zero or above. The mean
discrepancy is only 0.01, which is lower than would have been expected.
This fact may indicate that there has been some tendency to
overestimate the factor loadings of the pilot criterion. It might also mean that
the list of factors mentioned in this chapter goes much further toward
covering the pilot criterion than might be supposed. The coefficient of
correlation between predicted and obtained validities is 0.81, which
indicates considerable agreement.
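
The two summary figures quoted here, the mean discrepancy and the correlation between predicted and obtained validities, are simple to compute. The sketch below is an illustration only: it uses ten (predicted, obtained) pairs that remain legible in table 28.18, not the full set of 90 tests, so it is not expected to reproduce the printed 0.01 and 0.81. The discrepancy is taken as obtained minus predicted, which matches the signs in the table; the helper names are assumptions.

```python
import math

# (predicted, obtained) pilot validities legible in table 28.18.
rows = [
    (0.18, 0.17), (0.23, 0.20), (0.24, 0.26), (0.10, 0.11), (0.19, 0.15),
    (0.15, 0.12), (0.12, 0.17), (0.23, 0.18), (0.31, 0.17), (-0.03, -0.08),
]


def mean(values):
    return sum(values) / len(values)


def pearson(xs, ys):
    """Product-moment correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


predicted = [p for p, _ in rows]
obtained = [o for _, o in rows]
# Positive discrepancy: the test is more valid than the known factors predict.
discrepancies = [o - p for p, o in rows]
mean_discrepancy = mean(discrepancies)
r_pred_obt = pearson(predicted, obtained)
```

For this small subset the mean discrepancy is negative (about -0.03), largely because of the single Tool Function pair based on only 78 cases, while the correlation between predicted and obtained values is still high.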

Table 28.19. Distribution of discrepancies between predicted and obtained pilot
validities in 90 tests

Discrepancy          Frequency
+0.13 to +0.17            3
+.08 to +.12              7
+.03 to +.07             21
-.02 to +.02             41
-.07 to -.03             13
-.12 to -.08              4
-.17 to -.13              1
N                        90
M                      .011
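
The mean of 0.011 can be recovered from the grouped frequencies alone. The following sketch assumes the class intervals as reconstructed here (in particular, the two outermost intervals are read as 0.13 to 0.17 in absolute value) and uses interval midpoints, which is the usual approximation for grouped data. Variable names are hypothetical.

```python
# (lower limit, upper limit, frequency) for each class interval of table 28.19.
intervals = [
    (0.13, 0.17, 3),
    (0.08, 0.12, 7),
    (0.03, 0.07, 21),
    (-0.02, 0.02, 41),
    (-0.07, -0.03, 13),
    (-0.12, -0.08, 4),
    (-0.17, -0.13, 1),
]

total = sum(freq for _, _, freq in intervals)
# Grouped mean: each frequency weighted by the midpoint of its interval.
grouped_mean = sum((low + high) / 2 * freq for low, high, freq in intervals) / total
# Share of tests whose discrepancy lies within 0.02 of zero.
share_near_zero = 41 / total
```

The frequencies sum to 90, the grouped mean is about 0.011, and the central interval holds roughly 46 percent of the tests, consistent with "nearly half" in the text.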
Validities of Non-Validated Tests

The  predicted  validities  of  the  small  number  of  tests  in  table  28.18 
that  have  not  been  validated  do  not  reveal  any  spectacular  results,  but 
they  do  provide  some  guesses  as  to  the  probable  usefulness  of  certain 
tests.10  In  general,  the  carefulness  tests  do  not  promise  any  added  validity 
for  the  pilot  unless  the  carefulness  factor  itself  proves  to  be  valid.  This  is 
regarded  as  highly  unlikely.  If  anything,  one  might  predict  a  negative 
validity  of  the  carefulness  factor  for  pilots,  in  using  the  graduation- 
elimination  criterion.  Against  an  accident  criterion,  the  result  to  be  ex¬ 
pected  is  merely  an  interesting  open  question.  If,  as  indicated,  the  care¬ 
fulness  scores  are  not  valid  for  the  pilot,  but  are  valid  for  the  navigator, 
as  predicted  by  the  carefulness  hypothesis,  a  good  classification  instru¬ 
ment  would  be  available. 

The  test  Combat  Planes  offers  substantial  added  validity,  because  of 
its unique variance in Integration I. The predicted validity of 0.18 depends
largely upon the estimated validity of 0.25 for the I1 factor, which
in turn is based upon slender evidence.
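
The dependence of a predicted validity upon an estimated factor validity can be sketched as follows. This is an added illustration which assumes orthogonal factors, so that the predicted correlation of a test with the criterion is the sum of the products of their loadings on the common factors. The test loading of 0.60 and the alternative factor validity of 0.15 are hypothetical; only the value 0.25 comes from the text.

```python
# Sketch under the orthogonal-factor assumption; loadings are hypothetical
# except for the factor validity of 0.25 cited in the text.

def predicted_validity(test_loadings, criterion_loadings):
    """Sum of products of test and criterion loadings on shared factors."""
    return sum(
        loading * criterion_loadings.get(factor, 0.0)
        for factor, loading in test_loadings.items()
    )


hypothetical_test = {"integration_I": 0.60}  # assumed test loading

# Compare the estimate cited in the text with a lower, hypothetical one.
for factor_validity in (0.25, 0.15):
    criterion = {"integration_I": factor_validity}
    r = predicted_validity(hypothetical_test, criterion)
    print(f"factor validity {factor_validity:.2f} -> predicted validity {r:.2f}")
```

Because the prediction is proportional to the factor validity, any error in a slenderly supported estimate passes directly into the predicted validity of the test.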

Another test of special interest is the nonrotated form of Speed of
Identification, which has an expected validity of 0.13. This is much
lower than that predicted or obtained for the rotated form, but the nonrotated
form is purer; its unique contribution to the pilot stanine would
be presumably just as large and its irrelevant variance would be practically
nil.


CONCLUDING  STATEMENT 


Advantages  of  the  Factorial  Approach 

The  discussions  in  this  chapter  have  shown  several  advantages  of  the 
factor-analysis  approach  in  a  test-development  program.  Its  favorable 
features  as  they  appear  to  the  writer  may  be  summarized  as  follows : 

(1)  It  provides  an  exact,  quantitative  picture  of  tests  and  criteria  in 
terms  of  a  limited  number  of  stable,  meaningful  categories. 

(2) It enables us to understand why some tests are valid and why
others are not.

(3) It makes possible the substitution of one test for another in terms
of equivalent factor patterns.11

(4) It leads to the discovery and development of pure tests whose
contributions are unique. Such tests are demanded for precise classification.

(5)  It  leads  to  the  discovery  of  valid  factors  not  known  or  even  sus¬ 
pected  before. 


10 It should be remembered that these are minimum predictions of validity, since not all
valid factors may be accounted for.

11 [...] studies were begun at Keesler Field and at Sheppard Field, under the direction of
Psychological Research Unit No. 2. Large numbers of [...] half-day batteries and
administered in such order that correlations of [...] factor analyze these large matrices.
(See appendix C.)

(6) It makes possible the prediction of validities of tests and of composites
before the event of validation by ordinary methods.

(7) It should in time make possible a much more enlightened and
direct job analysis by inspection of jobs, and job description in factor-category
terms.

(8) It serves as a guide in new test development, making possible the
avoidance of irrelevant variances such as numerical, verbal, or reasoning
when not wanted.

(9) It makes possible the compilation of new test batteries to meet
new selection needs when new criteria are described in factorial terms.

Most  of  these  features  have  been  brought  out  and  their  applications 
have  been  illustrated  in  this  and  in  preceding  chapters.  Other  features 
will  receive  further  mention  and  use  in  the  concluding  chapter  which 
immediately  follows. 


CHAPTER TWENTY-NINE

General  Conclusions1 


The  preceding  chapter  brought  together  many  of  the  threads  of  dis¬ 
course  concerning  test  development  which  were  followed  in  somewhat 
isolated  but  systematic  fashion  in  earlier  chapters,  and  in  it  an  attempt 
was  made  to  obtain  a  unified  picture.  The  centralizing  principle  was  that 
of  factorial  structure  by  which  a  large  number  of  the  printed  tests  de¬ 
veloped  in  the  Army  Air  Forces  could  be  aligned  in  families  and  related 
to  validity  for  the  selection  of  pilot,  navigator,  and  bombardier. 

Unfortunately,  many  intellectual  and  perceptual  tests,  more  recently 
developed and never analyzed, and almost all of the temperament tests
had  to  be  omitted  from  that  significant  type  of  picture.  There  remains, 
therefore,  the  need  for  a  final  glance  over  the  contents  of  earlier  chapters 
in  order  to  see  what  the  positive  gains  have  been ;  to  evaluate  what  was 
done;  and  to  note  what  was  left  undone.  From  such  a  review,  investi¬ 
gators  either  in  the  Air  Forces  or  elsewhere  may  be  more  profitably  guided 
in  the  next  steps. 

This  chapter  will  accordingly  present  first  a  summarizing  paragraph  or 
two concerning each area of test development. Each area will correspond
usually  with  that  of  an  entire  chapter  on  tests.  Reference  will  also  be 
made  to  the  job-analysis  concepts  that  are  most  closely  related  to  the 
test  area  under  discussion.  Some  of  the  general  lessons  learned  in  the 
development  of  printed  tests  will  be  briefly  mentioned,  with  implications 
for  future  research  and  future  application  of  tests  in  classification  bat¬ 
teries.  Other  suggestions  which  more  naturally  grow  out  of  development 
of  apparatus  tests  and  motion-picture  tests,  and  out  of  the  experiences 
of  classification  testing,  will  be  found  in  other  reports. 

It  might  be  in  order  to  devote  some  space  to  the  implications  of  what 
has  been  learned  in  this  program  for  general  industrial  and  vocational 
testing.  It  is  believed,  however,  that  this  volume  should  be  confined  to  a 
descriptive  account  of  a  certain  segment  of  experiences  gained  by  aviation 
psychologists  in  a  particular  program.  It  is  also  believed  that  one  who 
reads  liberal  portions  of  the  volume  will  not  fail  to  find  implications 
for  more  general  testing  programs  if  he  reads  with  that  intention.  Some 
of  the  general  comments  made  in  the  later  part  of  this  chapter  may  serve 
to  direct  the  attention  of  such  a  reader. 

A  REVIEW  OF  THE  TEST  AREAS 

As each area of tests is mentioned, several considerations will be given
attention. The most closely related job-analysis categories will be mentioned
and the bearing that printed-test development and research have
had upon those categories will be pointed out. On the other hand, the
fundamental traits or factors that characterize the area will be mentioned,
with emphasis upon whatever unique factors the area has to offer. The
validity of each area or subarea for success in each type of training
will be given a summary statement in nonnumerical terms, where the
facts are known, and explanations of validity in terms of known factorial
composition will be cited. The needs for further test development and
research will be pointed out and the kind of research indicated.

1 Written by the editor.

Verbal  Ability  Tests 

The  list  of  20  job-analysis  traits  for  the  pilot,  which  was  given  most 
attention  in  the  program,  included  among  the  four  intellectual  traits  the 
category  of  comprehension.  Two  types  of  comprehension  tests  were  de¬ 
veloped:  verbal  and  mechanical.  The  latter  were  found  later  to  justify 
a  new  category  of  their  own,  namely,  the  mechanical  group.  This  left  the 
verbal-comprehension  tests  as  the  only  candidates  for  the  comprehension 
category. 

The  verbal  tests  fall  into  two  groups :  general  vocabulary  and  reading 
comprehension (see ch. 5). The vocabulary tests proved to be the purest
and  strongest  as  measures  of  the  verbal  factor.  This  factor  seems  to  be 
represented  in  the  navigator  criterion  only;  vocabulary  tests  have  no 
validity  for  either  bombardier  or  pilot.  In  fact,  they  have  a  slight  negative 
validity  for  the  pilot,  which,  taken  with  other  facts,  indicates  that  the 
verbal  factor  is  correlated  slightly  negatively  with  the  pilot  criterion. 
This  kind  of  relationship  would  surely  not  hold  for  the  lower  levels  of 
verbal  ability  in  which  one  would  expect  a  positive  correlation.  This 
reasoning  leads  to  the  inference  that  over  the  entire  IQ  range  (for  most 
IQ’s  are  predominantly  indices  of  verbal  comprehension)  the  regression 
of  pilot-training  success  on  the  verbal  factor  is  curvilinear. 
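
The inference of a curvilinear regression can be illustrated numerically. The sketch below is an added illustration and uses no data from the program. It assumes an inverted-U relation between a verbal score and training success, with invented score values, and shows that such a relation yields a positive correlation in the lower range of the score and a negative one in the upper range.

```python
from math import sqrt

# Hypothetical inverted-U relation; scores and outcomes are invented for
# illustration and are not measurements from the program.


def pearson(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


verbal = list(range(0, 11))                # assumed verbal-score scale, 0 to 10
success = [-(x - 6) ** 2 for x in verbal]  # assumed peak of success at a score of 6

lower = [(x, y) for x, y in zip(verbal, success) if x <= 6]
upper = [(x, y) for x, y in zip(verbal, success) if x >= 6]

r_lower = pearson(*zip(*lower))
r_upper = pearson(*zip(*upper))
print(f"lower range r = {r_lower:.2f}, upper range r = {r_upper:.2f}")
```

A selected population drawn mainly from the upper range of such a relation would show the slight negative validity described, even though the relation over the whole range is not negative.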

The  reading-comprehension  tests  used  in  the  Air  Forces  were  of  com¬ 
plex  factorial  composition,  and  as  a  result,  they  were  valid  to  some 
extent for all three air-crew positions. They were valid for the navigator
because  of  their  verbal  variance  (their  strongest  factor),  the  general 
reasoning  factor  (next  strongest),  to  a  smaller  extent  because  of  visuali¬ 
zation,  and  even  to  some  extent  because  of  a  slight  numerical  variance. 
They  were  valid  for  the  pilot  primarily  because  of  a  substantial  loading 
in  the  mechanical-experience  factor  and  to  a  less  extent  by  reason  of 
some  visualization  variance.  The  smallest  validity  was  for  the  bombardier, 
but  what  there  was  could  be  accounted  for  by  the  visualization  and 
numerical  components. 

Since  all  the  factorial  components  of  the  reading-comprehension  tests 
are  better  measured  by  purer  and  stronger  tests  for  them,  it  is  strongly 
urged  that  a  general  vocabulary  test  be  utilized  to  carry  the  burden  of 
measuring  the  verbal  factor  when  that  is  wanted. 

Mathematical Tests

Mathematical and arithmetical computation are rarely mentioned among
job-analysis  traits  in  connection  with  training,  but  as  will  be  seen  in 
chapter  1,  they  are  given  place  in  the  combat  studies.  Both  mathematical 
and  numerical-computation  tests  were  found  to  be  highly  valid  for  the 
navigator,  slightly  for  the  bombardier,  and  slightly  or  not  at  all  for  the 
pilot.  In  this  connection  it  should  be  said  that  the  Arithmetic  Reasoning 
test  is  not  treated  here  nor  in  chapter  6  as  a  mathematical  test,  but  rather 
as  a  reasoning  test. 

The  only  really  unique  feature  of  mathematical  and  numerical  tests 
is  their  variance  in  the  numerical  factor.  Although  this  factor  is  sub¬ 
stantially  weighted  in  all  the  mathematical  tests,  it  is  most  unambiguously 
and  satisfactorily  measured  by  the  Numerical  Operations  test,  which  also 
takes  much  less  time.  A  factor  identified  as  mathematical  background  is 
a  substantial  component  of  the  Mathematics  A  (general  mathematics)  test, 
but  it  is  slightly  more  efficiently  measured  by  the  Biographical  Data  Blank 
(navigator  score).  The  conclusion  is,  that  of  all  the  mathematical  tests 
tried,  Numerical  Operations  has  earned  a  permanent  place  wherever  a 
measure  of  sheer  numerical  facility  is  wanted.  No  other  mathematical 
tests,  as  such,  seem  to  be  fruitful  in  the  selection  and  classification  of 
air  crew. 

Reasoning  Tests 

Reasoning,  as  such,  was  never  mentioned  among  the  job-analysis  cate¬ 
gories.  The  concept  of  judgment  was  given  considerable  prominence, 
especially  in  connection  with  the  pilot,  however,  and  early  attempts  to 
analyze  judgment  from  the  psychologist's  armchair  led  inevitably  to  an 
examination  of  reasoning  tests  of  various  types.  While  the  Arithmetic 
Reasoning  test  was  first  developed  to  meet  some  of  the  needs  of  mathe¬ 
matical  tests,  it  was  shown  later  to  be  more  of  a  reasoning  test  and  so 
is treated in this group. It was realized early, however, that verbal and
numerical tests were not valid for the pilot, and so nonverbal and
nonnumerical reasoning tests were vigorously sought that might be valid
and might to some extent take care of the judgment category (see ch. 7).

Most  reasoning  tests,  but  not  all,  prove  to  have  variance  in  a  factor 
identified  only  as  general  reasoning,  which  is  best  measured  by  the  Arith¬ 
metic  Reasoning,  Forced  Landings,  Pattern  Comprehension,  and  Spatial 
Reasoning  tests.  It  is  almost  as  highly  involved  in  the  navigator  criterion 
as the numerical factor, but apparently not at all in either the pilot or
bombardier  criteria.  Thus,  so  far  as  this  type  of  reasoning  test  is  con¬ 
cerned,  no  progress  has  been  made  toward  covering  the  important  problem 
of  judgment  for  the  pilot. 

A  second  reasoning  factor,  perhaps  identifiable  as  the  ability  to  reason 
by analogy, is strongest in the Figure Analogies, Pattern Reasoning, and
the Gottschaldt Figures tests. It may have some small validity for the pilot,
but its validities for bombardier and navigator are unknown.

A third reasoning factor, strongest in the Spatial Reasoning and Decoding
tests, seems to have a negative validity for pilots. Nothing can be
said concerning its validity for other specialties.

In general, all reasoning tests are difficult to purify. The type of
material in which they are given (words, numbers, figures in spatial
arrangement) tends in itself to introduce an unwanted variance. Two
or more reasoning factors are also likely to go together in tests. Fortunately,
for navigator selection, both numerical and general-reasoning
factors are highly valid. Fortunately, also, both are invalid for the pilot,
so that a discriminating test is possible. For the bombardier, however,
one factor is probably valid and one is not. The fact that we can supplement
the Arithmetic Reasoning test by a pure numerical test, however,
probably takes care of the bombardier and vocations similar in this
respect.

Judgment  Tests 

The  category  of  practical  judgment,  which  was  so  highly  regarded  by 
instructors,  received  considerable  attention  (see  ch.  8).  It  was  demon¬ 
strated  that  verbally  presented  practical  problems  of  the  predicament 
type  were  factorially  quite  complex,  including  variances  in  verbal,  gen¬ 
eral  reasoning,  and  mechanical-experience  factors,  as  had  been  antici¬ 
pated.  But  they  also  have  variances  in  a  new  factor  tentatively  called 
planning  and  in  a  critical  or  judicial  trait  which  may  be  called  judgment. 
Early  judgment  tests  owed  much  of  their  pilot  validity  to  the  mechanical 
component.  This  was  natural  in  that  the  person  with  a  good  background 
of mechanical knowledge had an advantage both in solving the verbal-predicament
problems and in pilot training. But subsequent findings
have shown that the judgment factor, as such, has a positive contribution
to make to prediction of the pilot criterion. One type of item that
best measures this factor is of a common-sense-decision variety. Another
is of the work-planning type. Tests of practical estimations of sizes,
times, and distances are also related to this factor.

Near  the  end  of  the  war,  the  hypothesis  was  being  investigated  that 
another component that might be called thought fluency is an independent
contributor to practical judgment. The facile recall of pertinent experiences
for possible use in everyday predicaments would give an individual
more potential solutions from which to choose in a given unit of
time. This hypothesis should be followed up if possible. The construction
of tests of fluency using answer sheets presented new practical problems,
but they did not appear to be insurmountable.

A  judgment  test  of  the  verbal-problem  type  was  in  the  classification 
battery  during  the  last  year  of  the  war.  Though  the  early  tests  of  this 
kind  were  factorially  complex,  experience  showed  that  they  could  be 
somewhat purified. The test is time-consuming (requiring an average of
at  least  1  minute  per  item)  and  reliability  is  typically  low.  There  seems 
little  likelihood  that  reliability  can  be  substantially  improved  within  prac¬ 
tical  time  limits,  but  the  contribution  to  the  validity  of  the  pilot  stanine 
is  probably  undeniable,  in  spite  of  low  reliability.  The  test  should  remain 
in  the  battery  until  something  better  covering  the  judgment  factor  is 
demonstrated. 

It  seems  probable  that  other  experiential  background  than  the  mechan¬ 
ical  is  important  in  connection  with  what  the  instructors  call  practical 
judgment.  The  general-information  tests  may  owe  a  part  of  their  pilot 
validity  to  this  source,  though  analyses  have  shown  no  components  in 
these  tests  except  the  mechanical  experience  and  another  factor  identi¬ 
fied  as  pilot  interest.  In  this  connection,  as  well  as  for  other  reasons  in 
the  study  of  practical  judgment,  it  might  pay  to  collect  from  the  training 
fields  instances  of  good  and  poor  judgment  with  a  view  to  seeing  what 
kinds  of  information  would  have  been  pertinent  to  a  successful  solution 
to  the  practical  problem.  This  would  be  another  source  of  items  for  in¬ 
formation  tests. 

Foresight and Planning Tests

This category, which was given prominent place in analyses of the
pilot’s job, seems to need drastic revision as a concept. The several tests
that were developed to measure the hypothetical trait failed to show any
single element that was common to them all, when they were statistically
analyzed (see ch. 9). The a priori grouping of tests in this area under
the categories of pathway planning, economical procedures, and planning
by deduction also failed to find support in subsequent analyses. Tests in
this area are generally complex factorially, with perhaps strongest affiliations
with the general-reasoning and perceptual-speed factors, but some
of them did, in fact, have in common a new factor which was called
planning. It is not unique to planning tests, however, having been found
in some judgment tests. The slightly promising validity of the planning
factor for pilot training justifies further work toward better identification
of it and improvement of tests for it. Its validities for bombardier
or navigator are territories yet to be explored.

Integration Tests

Integration tests were developed with the intention of measuring by
means of printed tests the most valid aspects of the very successful Complex
Coordination test (see ch. 10). Ability to integrate activities in response
to perceptually complicated situations was the working hypothesis.
While the key to the valid intellectual component of the Complex Coordination
test proved to lie elsewhere, integration tests were found to contribute
three new factors apparently having to do with different aspects
of mental sets: persistence, adaptability, and span. The first two promise
some pilot validity, while the third seems to have a negative validity.
Their validities for navigator and bombardier are unknown. From the
superficial aspects of the tests one might expect some navigator validity.

Integration tests are relatively free from factors already well known,
except for general reasoning in one or two, but some exhibit more than
one of the integration factors. The relative uniqueness of these tests
should appeal to those who are looking for selective tests where variances
in mental-set abilities seem important. For the sake of determining factor
validities alone, the tests best representing those factors should be
validated for bombardier and navigator, and confirming studies of pilot
validity are needed.

Memory  Tests 

A systematic exploration of the area of memory tests that superficially
resembled memory tasks in aviation revealed three separate memory factors,
all probably valid to a small degree for the pilot (see ch. 11). The
first factor was called paired-associates memory, since it is characteristic
of paired-associates memory tests. It is probably identifiable with Thurstone’s
rote-memory factor. The second is a visual-memory factor, which
Carlson had previously discovered. It is characteristic of tests in which
both learning and recognition tests are pictorial and identical. The third
factor was confined to two tests in which object and name are paired.
There is some indication of navigator validity for some of the tests in
this group, but further study of this is needed.

One test, Memory for Tactical Plans, showed pilot validity strikingly
beyond that attributable to any known factors, including the three memory
factors mentioned above. Whether the valid component is a memory
or a nonmemory factor is not known. If it is a memory factor, whether it is
associated with the 2-hour delay between learning and recall quiz or with
the type of material (verbal instructions) is unknown. It can possibly
be identified with the integration I factor (persistence of mental set), or
integration I may be a memory factor, namely memory for instructions. Such
useful unknown variance is a challenge to discover its nature and to
capitalize upon it to a greater extent. This test could be improved for
the pilot by reduction of its verbal variance. Its variance in visualization
is dispensable, also, since it is covered better in other tests.

Visualization Tests

"Visualization of flight course" is one of the 20 prominent job-analysis
traits in the pilot list. The concept inspired very few test ideas, but results
of research have shown abundantly that a factor of visualization
does indeed exist, and that it is a significant component not only of the
pilot criterion but also of the navigator and bombardier criteria
(see ch. 12). Whether or not the concept of “visualization of flight
course” is close to the meaning of the visualization factor is an unanswered
question, but one that does not particularly matter unless it contains
important aspects that the factor

A variety of tests measures it, some of which, Mechanical Principles,
for example, were unsuspectingly developed for quite different purposes.
The ability seems to involve visual-thinking activity in which objects
must be moved or transformed in order to solve a problem. It is distinct
from visual memory on the one hand, and from spatial relations on the
other. Its separation from visual memory and the space factor, which
has traditionally been described as spatial visualization, is one of the important
psychological findings of the program. It is not a feature of
planning tests, as had been expected.

No visualization test yet developed is pure for the factor. General reasoning
is the most common secondary factor, and some mechanical tests
have  visualization  as  their  secondary  factor.  The  general-reasoning  vari¬ 
ance  might  be  eliminated  by  ridding  the  test  of  the  more  complicated 
and  difficult  items.  It  is  believed  that  oral,  verbal  presentation  of  the 
items  is  probably  best,  with  some  control  of  the  time  factor  for  each 
item.  Some  of  the  later  developed  tests  made  use  of  these  suggestions. 
It  is  important  that  they  be  analyzed  to  test  the  implied  hypotheses. 

Mechanical  Tests 

Many  varieties  of  mechanical  tests  were  tried  out  in  the  program— 
Mechanical  Information,  Mechanical  Principles,  Mechanical  Movements, 
Mechanical  Comprehension,  Mechanical  Functions,  Tool  Function,  and 
Physical  Principles,  not  to  mention  Pattern  Assembly  and  Pattern  Com¬ 
prehension  (see  ch.  13).  Both  mechanical  insight  and  mechanical  experi¬ 
ence  were  thus  surveyed  from  many  directions.  Familiarity  with  machin¬ 
ery,  a  natural  “knack”  for  understanding  new  mechanisms,  or  an  inborn 
aptitude  for  success  in  mechanical  tasks,  if  such  there  be,  were  all  thought 
to be covered by the range of tests included.

Experience bore out the expectation that mechanical tests would be
valid for air-crew selection. They were highly valid for the pilot and to
a  small  but  useful  degree  for  the  navigator.  It  was  surprising  to  find 
them  not  valid  for  the  bombardier,  in  view  of  his  dependence  upon  un¬ 
derstanding  of  the  bomb  sight  and  the  autopilot  in  his  training. 

The secret of the validity that can be called unique to mechanical tests
lies in the mechanical-experience factor. It would appear that any purely
mechanical aptitude is largely an acquired trait, because the factor is by
far the strongest in the Mechanical Information test. This fact does not
deny that mechanical experience is gained somewhat in proportion to the
power to gain it when equal opportunity is available. Opportunities for
mechanical experience are not equal, however, and hence it would have
been reasonable to expect two factors, one attributable to the power to
learn and the other to opportunity. It may be, in spite of inequalities of
opportunity, that two such factors are inextricably mingled in the analysis.
At any rate, there is no common variance in mechanical tests to be
accounted for, other than well-identified nonmechanical factors and the
factor so characteristic of mechanical-information tests. Supporting the
experience hypothesis in naming the factor are the very substantial loadings
for it in the Biographical Data Blank (pilot score) and in the General
Information test.

The development of mechanical tests for air-crew selection and classification
may be regarded as having reached a satisfactory status. A mechanical-information
test, which may also well include items on auto
driving, tool function, and motor malfunctions, best represents this area.
Not included in the survey of this area were tests of manipulations of
tools and machines. These abilities call for psychomotor tests. It is believed,
from the limited factor-analysis experiences with psychomotor
tests which have been mentioned in this volume, that no manipulative
abilities peculiar to tool handling or machine operating will be found that
are unique. More general psychomotor abilities will probably take care of
this  aspect  of  mechanical  endeavors. 

General  Information  Tests 

General-information  tests  were  found  to  be  valid  to  a  practical  degree 
for  both  pilot  and  navigator,  but  only  for  the  pilot  did  they  make  a 
unique contribution (see ch. 14). Their most valid types of item for
pilots were on aviation information, flying information, mechanical information,
active-sports-and-hobbies information, and automobile-driving
information. The unique contribution was in a factor called pilot
interest. Two substantial secondary variances were in the verbal and
mechanical-experience factors, which are covered by other tests.

In  addition  to  the  coverage  of  pilot  interest,  general-information  tests 
were  developed  in  connection  with  other  hypotheses,  e.  g.,  masculinity- 
femininity  discrimination,  and  leadership  qualities.  With  what  effective¬ 
ness  they  will  discriminate  men  on  these  variables  is  still  to  be  deter¬ 
mined.  These  aspects  of  information  tests  should  be  explored  further. 

Perceptual  Speed  Tests 

The concept of perceptual speed, introduced by Thurstone, seems likely
to  endure,  judging  by  the  ease  and  the  frequency  with  which  the  factor 
by  that  name  is  verified,  and  the  ease  of  constructing  pure  tests  for  it.  It 
appears  to  be  a  significant  variable  in  all  of  the  air-crew  positions  and  it 
is  well  measured  by  three  tests  in  the  classification  battery  (see  ch.  16). 

Tests of a clerical type, such as Graph Reading, Table Reading, and
Number Reading, were treated in this area because they probably have
a substantial variance in the perceptual-speed factor and because they
are highly speeded tests. Judging by their face appearance and their very
high navigator validities, they probably have even greater loadings in the
numerical and perhaps spatial-relations factors. In spite of their shortness
and consequent limited reliability, because of their high navigator
validities and their simplicity of administration and scoring, they deserve
serious attention whenever navigators must be selected in limited
time. They are probably more complex than is desirable for general classification
testing.

Form  Perception  Tests 

Tests in this area were developed to help to round out an exploration
of the general field of perceptual tests rather than to meet any recognized
job-analysis requirement. Two pattern-formation tests had small
to moderate validities for pilot selection, but this could be accounted for
in terms of already known factors adequately covered by better tests
(see ch. 17). Two completion tests had almost no pilot validity but probably
contain new factors not already listed in this program. Pattern-analysis
tests promise low to moderate pilot validity, the Gottschaldt test
in particular. The latter is factorially complex and might be cultivated
for  its  possible  variance  in  the  second  reasoning  factor.  There  is  evi¬ 
dence,  from  outside  the  program,  that  geometric-illusions  tests  contribute 
a  new  factor.  Neither  factorial  nor  validity  information  is  available  con¬ 
cerning  them  from  within  the  program.  Should  their  unique  contribution 
be  found  fairly  strong  and  univocal,  they  should  be  validated  against 
training  criteria  in  several  specialties. 

Size and Distance Estimation Tests

In  connection  with  these  tests  the  most  significant  finding  is  the  dis¬ 
covery  of  a  length-estimation  factor  which  is  undoubtedly  valid  for  the 
pilot  and  perhaps  for  other  air  crew  (see  ch.  18).  After  considerable 
effort  to  make  size-estimate  items  realistic  and  to  obtain  judgments  in 
aviation-material  settings,  it  was  found  after  all  that  the  simplest  form 
of  item — namely,  single  straight  lines — seems  best.  The  factor  was  also 
found  in  other  perceptual  tests  not  designed  as  space-estimation  tests, 
for  example,  the  Pattern  Assembly  test,  a  modified  Minnesota  Form 
Board  test. 

Angular-judgment  tests  may  be  found  to  measure  another  space-judg¬ 
ment  factor  that  is  valid  for  selection.  Tests  of  this  type  were  developed 
late  in  the  program  and  have  not  been  analyzed  or  fully  validated. 

Problems  of  the  generality  of  the  ability  to  judge  extents  in  one  di¬ 
mension  with  distance  judgments  in  three  dimensions,  or  of  judgments 
of  short  lines  seen  close  by  with  judgments  of  lines  at  great  distances, 
have  not  as  yet  been  solved.  Discussions  of  theoretical  considerations  of 
these  problems  will  be  found  in  report  No.  7  on  motion-picture  tests. 

Spatial  Tests 

These tests (see ch. 19) proved to be very highly valid for the pilot,
moderately  so  for  the  navigator,  and  significantly  so  for  the  bombardier. 
The  discovery  that  a  large  part  of  the  validity  of  psychomotor  tests  can 
be  attributed  to  the  spatial-relations  factor  and  that  this  factor  can  be 
measured as well, and perhaps even better, by means of printed tests,
such as Instrument Comprehension II and possibly Planning a Course, is
of  considerable  practical  importance. 

The  separation  of  space  appreciation,  as  such,  from  visualization  may 
be  regarded  as  a  distinct  step  forward  in  mental  measurement.  The  find¬ 
ing  of  a  second  space  factor  which  is  slightly  valid  for  pilots  and  not 
measured  by  any  classification-battery  test  is  also  a  positive  gain.  The 
presence and nature of a third possible space factor need considerably
more investigation before the factor can be accepted as real or its
characteristics defined. At present, it seems to be strongest in two
psychomotor tests, Two-Hand Coordination and Rotary Pursuit, but its
substantial
presence  in  two  printed  tests  (plotting  tests — rights  score)  suggests  that 
like  the  first  space  factor,  if  it  is  genuine,  it  also  is  amenable  to  measure¬ 
ment  by  printed  tests. 

Orientation  Tests 

Job  analysis  for  the  pilot  stresses  ability  to  maintain  orientation  in 
space  as  an  important  perceptual  trait.  This  inspired  a  large  number  of 
test  ideas,  and  a  great  many  kinds  of  orientation  tests  were  developed 
(see ch. 20). Insufficient work has been done with them to determine
whether  or  not  they  have  any  unique  valid  variance  to  offer.  Analyses 
thus  far  have  revealed  the  already  well  recognized  factors  of  perceptual 
speed,  spatial  relations,  and  visualization  in  them.  They  deserve  further 
study  if  only  to  identify  their  unknown  nonchance  variance  and  to  deter¬ 
mine  its  validity.  Orientation  tests  have  shown  considerable  pilot  and 
navigator  validity,  but  it  is  possible  that  the  factors  just  mentioned  and 
others  that  are  known  will  account  for  most  of  that  validity.  There  is 
still  the  possibility  that  a  compass-orientation  factor  can  be  brought  out, 
but  its  bearing  upon  air-crew  selection  is  problematical.  Many  of  the 
later  developed  tests  were  not  yet  analyzed  or  validated.  This  should  be 
done. 

Tests  of  Set  and  Attention 

Tests  of  attention  had  quite  low  pilot  validity  but  seem  to  have  more 
promise  for  navigator  validity  (see  ch.  21).  Whether  the  latter  can  be 
attributed to an attention factor or factors is unknown. An attention
factor  in  such  tests  as  were  used  had  been  reported  by  earlier  investi¬ 
gators. 

The hypothesis of a change-of-set trait was found to be without
support. Tests developed to measure it were almost entirely lacking in
communality. This finding is in harmony with prewar results on
perseveration tests, if we may regard change of set as the opposite of
perseveration.

The  integration  tests,  since  they  apparently  turned  out  to  be  meas¬ 
ures  of  various  aspects  of  mental  set,  belong  in  the  present  category.  At 
first  thought,  there  is  an  apparent  discrepancy  between  the  finding  of  the 
factor integration II (adaptability of set) and the failure to find a
change-of-set factor. Integration II was found strongest in the Following
Directions test. The difference in the two kinds of test must be that in
the latter the examinee is led to expect changes in set, and he can or
cannot adjust himself to them. In the change-of-set tests, every
condition leads him not to expect a change of set. Changes that do occur
are probably more or less fortuitous. Although the reliabilities of
change-of-set tests could not well be determined, one might expect them
to be rather low, except as they depend upon other factors. Because of
the disparateness of the kinds of material and tasks in different
change-of-set tests, other factors likewise failed to appear. It was to
avoid the
appearance  of  inconsistency  that  integration  II  was  defined  provisionally 
as  a  matter  of  adaptability  of  set  rather  than  flexibility  of  set. 

The  job-analysis  category  of  attention  is  called  division  of  attention. 
No  test  was  designed  specifically  for  that  aspect  of  attention,  but  it  is 
possible  that  one  or  more  of  the  integration  factors  come  close  to  the 
concept of division of attention; but which ones they may be cannot be
decided  offhand.  By  definition,  as  at  present  envisaged,  integration  III 
would  seem  logically  closest,  but  that  is  the  one  that  appears  to  have 
some  negative  validity  for  the  pilot,  and  this  does  not  agree  with  the 
job-analysis  idea.  Further  study  of  what  is  actually  meant  by  division 
of  attention  of  the  pilot  in  action  is  needed.  It  may  well  be  that  apparatus 
tests  are  demanded  for  it. 

Personality  Inventories 

A  number  of  commercial  personality  inventories  of  the  questionnaire 
type  were  tried  in  order  to  assess  the  general  promise  of  this  kind  of  ap¬ 
proach  to  suitable  temperament  qualities  in  flying  trainees  and  combat 
aviators  (see  ch.  23).  These  tests  generally  failed  to  exhibit  pilot  validity 
for  the  training  criterion  when  scored  with  the  already  established  keys. 
There  were  exceptions  for  three  inventories,  in  which  keys  gave  significant 
validities  between  0.10  and  0.20. 

The  number  of  valid  items  yielded  by  these  inventories  was  generally 
small, and sets of apparently valid items based on two independent
samples did not contain many items in common. Cross-validation of
empirical keys derived from sets of valid items generally showed failure
of the
validity  of  the  items  to  hold  up,  with  one  or  two  exceptions.  It  is  urged, 
however,  in  view  of  the  paucity  of  valid,  printed,  temperament  tests, 
that  the  most  valid  items  (preferably  those  with  correlation  coefficients 
with  the  criterion  that  are  significant  at  the  1  percent  level)  be  selected 
from all inventories and assembled for future validation as a single test,
against all training criteria. Validation against combat criteria, which
would have been most desirable for all temperament tests, was not
practicable, and since the close of the war is, of course, impossible.
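
The shrinkage of empirical keys on cross-validation can be illustrated with a small simulation. The sketch below is not the procedure used in the program; it assumes purely hypothetical data in which every item response and the pass-fail criterion are coin flips, so that no item has any true validity. A key built from the ten "most valid" items in one sample still looks valid in that sample, but not in a second, independent sample. The sample sizes, item count, and function names are all illustrative assumptions.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate_sample(rng, n_people, n_items):
    """Hypothetical sample: item responses and criterion are all coin flips."""
    items = [[rng.randint(0, 1) for _ in range(n_items)] for _ in range(n_people)]
    criterion = [rng.randint(0, 1) for _ in range(n_people)]
    return items, criterion


def item_validities(items, criterion):
    """Correlation of each item with the criterion."""
    n_items = len(items[0])
    return [pearson([row[j] for row in items], criterion) for j in range(n_items)]


def key_validity(items, criterion, key):
    """Validity of the total score over the keyed items."""
    scores = [sum(row[j] for j in key) for row in items]
    return pearson(scores, criterion)


rng = random.Random(0)
items_a, crit_a = simulate_sample(rng, 200, 100)  # derivation sample
items_b, crit_b = simulate_sample(rng, 200, 100)  # cross-validation sample

# Build the key from the ten items that look most valid in sample A only.
validities_a = item_validities(items_a, crit_a)
key = sorted(range(100), key=lambda j: validities_a[j], reverse=True)[:10]

r_derivation = key_validity(items_a, crit_a, key)
r_cross = key_validity(items_b, crit_b, key)
print(f"derivation r = {r_derivation:.2f}, cross-validation r = {r_cross:.2f}")
```

Because the items were selected for their chance correlations in the first sample, the derivation validity is inflated, while the cross-validation value hovers near zero. Real inventories with a few genuinely valid items would fall between these extremes.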

Other  experiences  with  tests,  as  well  as  these  with  inventories,  lead 
to  the  conclusion  that  one  stands  the  chance  of  most  positive  gains  by 
designing tests and items for specific purposes. An exception to this
generalization is that tests known to measure well-recognized factors are
adaptable wherever those factors are significant components of the
criteria one desires to predict. One might, therefore, be able to construct
many new questions more suitable for air crew selection, if the
questionnaire type of item is what one wants. It is possible that better
types of items would serve the same purpose; types that do not have some
of the same objectionable features. Under the present circumstances,
however, it seems desirable to follow out the suggestion of the preceding
paragraph, to have the satisfaction of knowing whether questionnaires
do or do not have anything unique to offer; and, if they have anything
to  offer  at  all,  whether  other  types  of  items  can  serve  the  same  purpose. 

Preference  Inventories 

Vocational  interest  and  other  preference  inventories  were  studied  in 
the  same  manner  as  personality  questionnaires  with  quite  similar,  though 
less  promising,  results  (see  ch.  23).  It  is  probable  that  further  work  on 
this  type  of  instrument  would  be  unfruitful  as  compared  with  other 
types  of  interest  tests  developed  specifically  for  the  AAF  with  rather 
gratifying  results. 

Projective  Tests 

Neither  of  the  best  known  projective  tests — Rorschach  and  Thematic 
Apperception — showed  promise  of  a  practical  degree  of  validity  for 
selection  to  meet  a  pilot-training  criterion  (see  ch.  24).  Adaptation  of 
the  Rorschach  to  group  testing  yielded  results  that  seem  to  be  worth 
following  up.  It  is  believed,  however,  that  new  ink  blots  developed  for 
the  purpose  would  be  desirable.  Adaptations  of  the  thematic  principle 
to  printed  group  tests  were  tried  in  a  number  of  ways.  None  that  was 
validated  against  the  pilot-training  criterion  was  promising.  Validation 
against a more pertinent criterion in which psychoneurosis plays a more
common  role  would  be  most  desirable,  for  these  and  other  temperament 
tests. 

Observational  Methods 

These  constitute  a  heterogeneous  list  of  procedures  in  which  the  evalu¬ 
ations  are  in  the  form  of  ratings  by  observers.  Observations  of  students 
were made under various kinds of situations: during the taking of
psychomotor tests, during rest periods between psychomotor tests, and
during interaction-test situations. Typically, a number of traits were
looked for and rated (see ch. 24).

Like  all  personal  ratings  of  traits,  these  evaluations  suffer  from  the 
subjectivity  factor  of  the  human  observer.  Individual  differences  among 
raters both as to numerical values and as to qualities emphasized are
bound  to  occur  even  under  the  best  indoctrination.  Wherever  significant 
validities for such ratings may be found, it is desirable to set up
hypotheses as to the significant traits involved and then to seek
objective, quantitative indicators of the same traits.

An  incidental  finding  that  is  of  interest  in  connection  with  observa¬ 
tions  was  that  ratings  of  goodness  of  physical  appearance  of  students  as 
they  took  their  group  tests  had  zero  validity  (0.03)  against  the  pilot¬ 
training  criterion.  The  correlation  of  these  ratings  with  evaluations  based 
upon  personal,  quasi-psychiatric  interviews  was  so  low  (0.13)  as  to  indi¬ 
cate  that  physical  appearance  as  such  has  a  minimal  bearing  upon  the 
interview  result.  The  reliability  of  the  interview  rating  is  unknown. 

Clinical  Type  Procedures 

Recognized  as  being  impractical  in  a  mass-testing  program,  clinical- 
type  procedures  which  emphasize  a  global  approach  to  personality  were 
nevertheless given an experimental trial (see ch. 24). Ratings of
probable success in training were made, taking into account a large mass of
data  concerning  each  student.  The  sources  of  information  were  an  hour 
interview,  scores  from  Rorschach  and  Thematic  Apperception,  and  simi¬ 
lar  tests,  and  observational  data.  The  over-all  ratings  showed  practically 
no  validity  for  predicting  the  training  criterion.  This  was  true  in  spite 
of  the  fact  that  intimate  observations  of  men  in  pilot  training  had  led  to 
the  belief  that  personality  traits  were  of  considerable  moment  in  students’ 
adjustments  to  training,  and  probably  to  success  in  training. 

While  this  finding  does  not  also  answer  the  question  whether  the 
psychoneurotic-prone individual can be detected by similar methods, the
costliness  of  the  procedures  in  personnel  time  and  the  subjectivity  of  the 
evaluations  are  good  reasons  for  hesitation  to  pursue  the  matter  further 
in  a  program  in  which  an  important  goal  is  objectivity.  Proponents  of 
the  global  approach  and  the  clinical-type  procedures  that  go  with  them 
may  be  able  to  say  that  the  approach  was  not  given  a  complete  trial,  and 
members of the program will be ready to admit that this is so. Whenever
any such project fails it can always be said that the optimal procedures
were not employed. The reply to such an assertion is that relatively
minor shortcomings should not be responsible for complete failure.
There may be room for debate, of course, as to whether a shortcoming
is minor or is a keystone.

Masculinity-Femininity  Tests 

Explorations were started with masculinity-femininity tests of the
traditional type (Terman-Miles and Goodenough) and of information
tests presumed to be discriminatory of masculine versus feminine
characteristics (see ch. 25). The stimulus for this was the hypothesis,
derived from observations in combat zones, that the masculine-type man
is on the whole a better leader and also probably less
psychoneurotic-prone. It was also believed that the trait would show up,
though to a less degree, as between graduates and eliminees in training.

The traditional tests, when applied to aviation students, have shown
very low internal consistencies. A single sex would naturally be
expected to show less internal consistency than a mixed group, but there
should still be significant differences within each sex. Several
information tests were developed but none was validated against the
training
criteria.  They  should  also  be  item-validated  by  carefully  sampling  large 
groups  of  males  and  females  in  representative  populations  in  the  age 
range  of  aviation  students,  in  order  to  test  the  hypothesis  more  fully. 
The  failure  of  the  masculinity  score  of  the  Guilford-Martin  Inventory 
of  Factors  GAMIN  to  correlate  with  the  pilot-training  criterion  (see 
ch.  23)  is  one  indication  of  the  probable  result  with  other  masculinity- 
femininity  tests. 

Carefulness  Tests 

These  tests  were  developed  on  the  basis  of  a  hypothesis  arrived  at 
after  months  of  study  of  navigation  training  by  the  psychological  re¬ 
search  project  (navigation) ;  the  hypothesis  was  that  critical  errors  are 
made  because  the  navigator  is  careless  in  minor  details.  Four  tests  of  a 
complex  clerical  type,  with  some  face  validity  for  navigation,  were  de¬ 
veloped  and  analyzed  (see  ch.  25).  It  was  a  striking  finding  that  their 
error  scores  yielded  a  new  common  factor  practically  independent  of  the 
rights  scores  for  the  same  tests.  This  seemed  to  be  a  carefulness  factor, 
such  as  the  hypothesis  had  called  for.  Its  validity  for  the  navigator  has 
not yet been determined and is one of the urgent postwar
navigator-selection problems. Incidentally, the finding of such divergent
communalities between rights and wrongs scores for the same tests led to
renewed scrutiny of similar phenomena in other tests and a reexamination
of  the  general  utility  of  formula  scores. 
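
The text does not state which formula scores were in use. As a hedged illustration only, the sketch below assumes the conventional correction-for-guessing form, rights minus wrongs divided by one less than the number of answer choices. It shows why such a score can hide the distinction drawn above: two hypothetical examinees with quite different wrongs counts receive the same formula score.

```python
def formula_score(rights, wrongs, n_choices):
    """Conventional correction-for-guessing score: R - W / (k - 1).

    Assumed form; the formula actually used in the program is not given here.
    """
    if n_choices < 2:
        raise ValueError("an item needs at least two answer choices")
    return rights - wrongs / (n_choices - 1)


# Two hypothetical examinees on a five-choice test.
fast_but_careless = formula_score(rights=40, wrongs=8, n_choices=5)
slow_but_careful = formula_score(rights=38, wrongs=0, n_choices=5)
print(fast_but_careless, slow_but_careful)
```

If wrongs scores carry a carefulness factor that rights scores do not, a single combined score of this kind merges two sources of variance that may have different validities.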

Fear  and  Tension  Tests 

Tenseness and fear and apprehension are given as two separate
categories in the job-analysis list of pilot traits. The theory behind test
construction implicitly treats them as one, though tenseness as a symptom
is  measurable  better  by  apparatus  tests,  while  tendencies  to  fear  are 
amenable  also  to  testing  with  printed  instruments. 

Tests  based  upon  expressions  of  opinions  and  attitudes,  aimed  at 
assessing generalized fear and nervousness, were only minimally
satisfactory when validated against the pilot criterion (see ch. 25). The
variance was probably unique, but the correlation with the pilot criterion
was so low that the contribution to the composite score was insufficient
for practical gain. The exact source of even this low validity is yet to be
determined. Judging from analogous results, we might find it not to be
fear or nervousness at all. But whatever it is, the effort to define it
might  be  worth  while. 

Confidence Tests

Lack of confidence is mentioned in the list of traits significant in
navigator and bombardier training, but not among those for the pilot.
Clinical observations by members of the program, however, tend to relate the
trait  also  to  pilot-training  success. 

Of  several  methods  tried,  only  one,  which  was  based  upon  self-ratings 
of  performance  on  psychomotor  tests,  was  valid  for  pilot  training,  and 
this  to  a  practical  degree  (see  ch.  25).  The  score  was  the  divergence 
between  the  student’s  rating  and  the  actual  level  of  performance.  The 
more realistic the judgment, the greater the student’s chances of
graduation. It is not known whether this validity should be attributed to
self-confidence or to some other aspect of the rating. The subjective
element in ratings is again a cause for restraint against using the device
for selection and classification.
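
A divergence score of this kind might be computed as in the following sketch. It assumes, hypothetically, that the self-rating and the actual performance level are expressed on the same numerical scale and that the divergence is taken as an absolute difference; the text does not say whether direction (overrating versus underrating) was retained. The student labels and values are invented for illustration.

```python
def realism_discrepancy(self_rating, actual_level):
    """Absolute gap between self-rated and actual performance (same scale assumed)."""
    return abs(self_rating - actual_level)


# Hypothetical students: (self-rating, actual level), both on a 1-9 scale.
students = {"A": (7, 7), "B": (9, 4), "C": (3, 6)}

discrepancies = {
    name: realism_discrepancy(rating, actual)
    for name, (rating, actual) in students.items()
}

# Smaller discrepancy means a more realistic self-judgment.
ranked = sorted(discrepancies, key=discrepancies.get)
print(discrepancies, ranked)
```

A signed difference would separate overconfident from underconfident students, which might help decide whether the validity belongs to self-confidence or to realism of judgment.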

Social  Intelligence  and  Leadership  Tests 

The  need  of  measures  to  show  how  adept  and  how  well  adjusted  an 
individual  is  in  social  relationships  is  urgent  as  one  aspect  of  the  problem 
of the selection of leaders. Internally consistent sets of items were
developed for tests in this area, but the nature of the communality was not
established,  and  no  validation  data  are  available  (see  ch.  25).  Although 
the  program  was  called  upon  to  supply  a  “mental-alertness”  score  and  a 
flight  officer  examination  to  be  used  toward  the  discrimination  of  officer 
material,  not  a  great  deal  of  research  was  directed  toward  the  specific 
officer  problem.  In  view  of  the  general  importance  of  the  problem  to 
the  Air  Corps,  it  would  seem  that  much  more  attention  should  be  given 
to  it. 

Motivation  Measures 

Motivation,  including  interests  and  attitudes,  was  mentioned  as  a  sig¬ 
nificant  trait  in  connection  with  all  three  assignments  in  training  and 
in combat. The subject has several aspects which were treated in chapter
26,  and  several  types  of  instruments  were  tried  out  to  meet  the  apparent 
needs. 

The student’s own expression of his preferences for training and of
his  strength  of  interest  in  the  types  of  training  given  him  were  used 
along  with  aptitude  scores  to  determine  classification,  until  the  last  year 
of  the  war.  These  expressions  of  interest  failed  to  correlate  with  gradua¬ 
tion-elimination  to  a  practical  degree  except  for  the  navigator.  For  the 
navigator,  also,  expressions  of  interest  correlated  substantially  with 
valid navigator tests and with the navigator stanine. It is believed that
the  navigator’s  ratings  of  interest  were,  therefore,  made  in  a  more  en¬ 
lightened  manner,  with  greater  self-knowledge  of  potentialities.  That 
this  made  the  ratings  themselves  more  valid  is  an  interesting  hypothesis. 

Tests  that  relatively  objectively  assess  a  factor  of  pilot  interest — the 
General Information (pilot score) and possibly the Biographical Data
Blank (pilot score) — were valid to a practical degree. No factor of
navigation interest, as such, emerged. Interests in academic work, more
particularly in mathematics and numerical work, probably served a function
in the selection of navigators comparable with that of pilot interest for
the pilots. These interests may have been summarized, or incorporated
in the mathematical background factor found in the Biographical Data
Blank (navigator score) and in the Mathematics A test.

Interests  and  attitudes  of  fighter  and  bomber  pilots  proved  to  be  suf¬ 
ficiently  divergent  and  stable  that  it  was  possible  to  assess  them  by  means 
of  satisfaction  tests. 

Tests  designed  to  assess  combat  readiness  did  not  show  significant 
validity against the training criterion. The internal consistency was
generally  low.  If  this  supposed  trait  is  to  be  measured,  it  is  recommended 
that  a  new  start  be  made. 

Biographical Data Blank

The  outstanding  success  in  deriving  valid  scoring  keys  for  this  type 
of information is proof enough of the utility of this kind of test (see
ch.  27).  Furthermore,  it  has  unique  contributions  to  make.  The  pilot 
validity  for  the  pilot  score  is  due  in  large  part  to  the  variance  in  me¬ 
chanical  experience,  a  component  that  could  well  be  dispensed  with  in 
the  blank.  There  is  much  in  the  pilot  validity  that  is  yet  to  be  accounted 
for, and no hypothesis for it has as yet been accepted.

The question of how much, if any, validity of the scores is affected
by guileful falsification has apparently been answered satisfactorily by
means of an experimental study reported in chapter 27.

There is now little doubt of the need for this type of instrument in
the classification battery. Scoring keys were in the process of
construction for flight engineer, and it is likely that they could also be
profitably constructed for radar observer and other specialists, including
the bombardier. Possibilities of new items for pilot and navigator are real,
as late item validations have shown. It is probably desirable to keep any
such instrument up to date by periodic revaluation of items. Once the
secrets of validity are unfolded, as in the finding that the pilot score
depended much upon the mechanical factor for its validity, tests may be
constructed to cover the same trait more objectively.

Other  Job-Analysis  Categories 

The listing of tests above does not exhaust the job-analysis categories.
The reasons will be clear as they are mentioned. Estimation of speed is
not a trait amenable to printed testing; it requires apparatus presentation
and control, and is best adapted to the motion-picture medium. The
sense of sustentation obviously requires apparatus. Speed of decision and
reaction was subjected to testing first by means of the Discrimination
Reaction Time test. The analysis of that test shows that its validities for
all three air-crew positions can be almost fully accounted for by known
factors without resort to a decision factor or a speed factor. There is much
nonchance variance in the test still unaccounted for, however, and a
small part of its pilot validity. Thurstone’s finding of a speed-of-decision
factor (1) calls for further exploration of this area. The types of tests
in which he found it, however, were very different from the
Discrimination Reaction Time test. An effort was made to duplicate this
test in printed form (see ch. 19), but whether it will carry all the factors
present in the apparatus test remains to be seen.

The  job-analysis  traits  mentioned  under  the  heading  of  coordination 
and  technique  are  suggestive  of  apparatus  tests.  No  printed  tests  were 
even suggested to measure them: coordination, appropriateness of
controls used, feel of controls, smoothness of control movement, and
progress in developing technique.

Among  the  temperament  categories,  absence  of  tenseness  has  gener¬ 
ally  been  regarded  as  an  apparatus-test  subject.  By  the  direct  observa¬ 
tional  methods  involving  ratings  of  tenseness  during  performance  on 
psychomotor  tests,  little  success  was  achieved.  Ratings  varied  so  much 
from  observer  to  observer  and  from  task  to  task  as  to  justify  little 
confidence in the validity of the ratings. Absence of confusion and
nervousness has also been regarded as subject matter for apparatus tests.
Direct  observations  seemed  to  give  promise  of  validity  in  pilot  selection, 
but  objective  scores  on  the  task  were  even  more  promising  and  it  is  un¬ 
known  whether  the  ratings  added  anything  new.  The  two  were  highly 
correlated.  The  validity  of  both  might  represent  variance  in  coordination 
or  skill  rather  than  in  a  temperament  variable.  The  category  of  suitable 
temperament  is  indeed  ambiguous.  It  can  be  interpreted  broadly  enough 
to encompass all other temperament categories. Until better defined, it
is  of  no  use  as  a  guide  to  test  ideas. 

THE  SELECTION  OF  BATTERY  TESTS 

It  may  be  of  value  to  record  here  some  recommendations  as  to  the 
compilation  of  future  test  batteries  for  whatever  purpose  they  may  be 
needed  but  more  particularly  for  air-crew  classification.  There  may  well 
be  differences  of  opinion  on  theory  and  procedure  as  well  as  upon  specific 
recommendations.  The  ones  made  here  are  not  by  any  means  the  only 
ones  that  could  be  made.  They  represent  the  result  of  the  experiences  of 
many  investigators  but  the  biases  of  a  single  interpreter. 

Much discussion was held in the later months of the program
concerning alternative batteries. Alternative batteries, or at least
replaceable tests,
are  most  desirable  when  retesting  is  called  for.  During  the  early  months 
of  the  program,  adherence  to  the  rule  of  no  retesting  was  strict.  After 
excombat  returnees  began  coming  in  large  numbers  and  applying  for 
air-crew  training,  retesting  of  them  was  made  an  important  exception 
to  the  general  policy.  Another  reason  for  alternative  instruments  is  the 
matter of possible coaching, either by those who had been through the
tests  or  by  an  outside  agent  who  has  somehow  discovered  the  nature  of 
the tests. Experience did not lead to great concern on this point. The
battery was very long and quite varied in content. Many tests were not
very  adaptable  to  coaching  and  others  went  through  revision  of  content 
from  time  to  time.  At  any  rate,  the  question  of  what  tests  are  inter¬ 
changeable,  and  the  existence  of  a  large  number  of  experimental  tests 
that  have  never  been  used  in  the  classification  battery  both  call  attention 
to  the  need  for  decisions  as  to  alternate  tests  and  possibly  as  to  alternate 
batteries.  Having  made  these  decisions  in  advance,  a  new  battery  could 
be  put  into  use  on  shorter  notice. 

The  recommendations  to  follow  will  not  go  beyond  tests  that  have 
been  factor  analyzed.  Again,  the  knowledge  of  factorial  content  and  of 
validities  of  factors  will  be  an  important  source  of  guidance.  It  seems 
obvious  to  the  writer  that  the  central  principle  to  be  followed  in  setting 
up any battery would be to note first the factors that have variance in
the  criteria  to  be  predicted  and  their  relative  amounts,  then  to  select  the 
best  tests,  in  terms  of  strength  and  purity  in  those  factors.  Noting  the 
factors  with  positive  validity  for  the  pilot,  in  table  28.17  for  example, 
one  might  then  go  to  table  28.15  and  under  each  factor  find  the  best  test 
of  it  for  pilots,  taking  into  consideration  its  other  substantial  loadings, 
if  no  pure  test  is  available.  Especially  to  be  avoided,  if  possible,  would 
be  the  inclusion  of  a  test  with  a  secondary  or  tertiary  loading  in  a  factor 
that  has  negative  validity  for  the  criterion  in  question. 
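
The selection principle just described can be sketched in a few lines. Everything in the example is hypothetical: the factor validities, the test names, the loadings, the cutoff of 0.30 for an unwanted loading, and the ranking rule (loading multiplied by purity, where purity is the share of a test's common variance lying in the target factor). None of these figures comes from tables 28.15 or 28.17; the sketch only shows one way the stated rule might be applied.

```python
# Hypothetical factor validities for one criterion (for example, pilot training).
FACTOR_VALIDITY = {"spatial": 0.30, "perceptual_speed": 0.15, "verbal": -0.05}

# Hypothetical factor loadings for four candidate tests.
LOADINGS = {
    "Test A": {"spatial": 0.60, "perceptual_speed": 0.10, "verbal": 0.05},
    "Test B": {"spatial": 0.65, "perceptual_speed": 0.05, "verbal": 0.35},
    "Test C": {"spatial": 0.15, "perceptual_speed": 0.55, "verbal": 0.00},
    "Test D": {"spatial": 0.40, "perceptual_speed": 0.45, "verbal": 0.10},
}


def purity(row, factor):
    """Share of a test's common variance that lies in the target factor."""
    total = sum(value * value for value in row.values())
    return row.get(factor, 0.0) ** 2 / total if total else 0.0


def pick_tests(factor_validity, loadings, max_negative_loading=0.30):
    """For each positively valid factor, pick the strongest, purest test.

    Tests loaded at or above the cutoff on any negatively valid factor
    are excluded from consideration.
    """
    negative = [f for f, v in factor_validity.items() if v < 0]
    eligible = [
        name for name, row in loadings.items()
        if all(row.get(f, 0.0) < max_negative_loading for f in negative)
    ]
    chosen = {}
    for factor, validity in factor_validity.items():
        if validity <= 0 or not eligible:
            continue
        chosen[factor] = max(
            eligible,
            key=lambda name: loadings[name].get(factor, 0.0)
            * purity(loadings[name], factor),
        )
    return chosen


selection = pick_tests(FACTOR_VALIDITY, LOADINGS)
print(selection)
```

In this invented case Test B has the highest spatial loading but is passed over because of its secondary loading in a negatively valid factor, which is the situation the paragraph above warns against.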

In  what  follows,  no  attempt  will  be  made  to  propose  complete  bat¬ 
teries.  Instead,  each  factor  will  be  considered  in  turn,  and  the  applicability 
of  the  best  tests  of  the  factor  to  the  selection  of  air  crew  will  be  men¬ 
tioned.  Lack  of  information  concerning  validities  of  many  of  the  factors 
for  bombardier  and  navigator  will  prevent  a  completely  satisfactory  cov¬ 
erage.  In  some  instances  tests  will  be  recommended  for  navigator  selec¬ 
tion  contingent  upon  the  later  finding  that  the  factor  is  a  component  of 
the navigator criterion. Very few references will be made to the
bombardier, since very little is known factorially regarding the
bombardier criterion, nor is there much prospect that much will be known.
Listed  under  each  factor  category  will  be  the  recommended  tests  and 
following them the specialties for which recommended, if the factor is
valid.  Following  the  list  some  qualifications  may  be  given.  Code  numbers 
do not specify the exact form of the test, since different forms are usually
factorially alike, and it is also presumed that new forms will be developed,
with improvements where called for. The tests are mentioned
in  approximately  the  order  of  choice,  though  in  many  instances  they  are 
indistinguishable  as  to  probable  value. 

Carefulness  Tests 

Plotting  Test,  CE452  (wrongs  score),  for  navigator. 

Complex  Scale  Reading,  CE454  (wrongs  score),  for  navigator. 

Plotting Accuracy, CE453 (wrongs score), for navigator.

Note. — The  third  in  the  list  is  not  quite  as  pure  as  the  others.  Its  sec¬ 
ondary  numerical  variance  should  make  it  more  valid  for  the  navigator. 


Integration  I  Tests 
Signal  Interpretation,  CI656,  for  navigator. 

Combat Planes, CI655, for navigator.

Flight Formations, CI654, for pilot or navigator.

Note. — Signal Interpretation contains too much general-reasoning and
integration III variances to be a good test for pilots. Combat Planes contains
too much of the same two factors, plus some unwanted verbal
variance.

Integration  II  Tests 

Following  Directions,  CP402,  for  pilot  or  navigator. 

Code  Analysis,  CI653,  for  navigator. 

Note. — The  choice  for  pilots  is  the  lesser  of  two  evils.  Both  tests 
contain  some  verbal  variance,  but  Code  Analysis  also  has  numerical  and 
general-reasoning components, and what is probably worse, a substantial
amount  of  variance  in  integration  III. 

Integration  III  Tests 

Note. — No  test  in  this  group  is  called  for  in  pilot  selection,  owing  to 
the possible negative pilot validity of this factor. Code Analysis, CI653,
is probably best for navigator, if the factor has navigator validity.
Planning a Course, CI406, would be a suitable second choice, since its other
substantial loading is in spatial relations, which is known to be valid for
the navigator. If the planning factor is found to have navigator validity,
then Planning Air Maneuvers, CI408, would be desirable. If Reasoning
III has navigator validity, then Figure Classification, CI213, would be
suitable. One defect of integration tests is that all are complex. A good
hypothesis  is  needed  regarding  the  nature  of  integration  III  and  new 
tests  to  maximize  that  type  of  variance. 

Judgment  Tests 

Practical Judgment (nonmechanical), CI301, for pilot.

Commonsense Judgment, AAFQE JR Part 3, for pilot or navigator.

Practical Estimations I, CI308, for pilot.

Note. — The first of the three, Form CI301BX1, had low verbal variance
but substantial general-reasoning variance. The second has probably
more verbal variance than is good for a pilot test. The third (A form)
had  zero  verbal  variance,  and  its  mechanical-experience  and  planning 
components  are  valid  for  the  pilot. 

Length  Estimation  Tests 
Shorter Line, CP606, for pilot or navigator.

Nearest  Point,  CP607,  for  pilot. 

Note.— The  first  is  practically  pure,  and  is  recommended  even  though 
other  tests  have  a  greater  loading  in  the  factor.  The  second  has  small 
amounts of visual-memory, spatial-relations, visualization, and verbal
(negative loading) components, all of which would add to the pilot
validity of the test. Probably better than either will be the Estimation of
Length  test,  CP531,  but  its  composition  is  as  yet  unknown. 

Mathematical Background Tests
Biographical  Data  Blank  (Navigator  score),  CE602,  for  pilot  and 
navigator. 

Note. — Mathematics  A  is  not  recommended  because  of  its  length  and 
its  strong  numerical  variance.  It  is  significant  that  mathematical  back¬ 
ground  apparently  has  validity  for  the  pilot,  whereas  the  numerical  fac¬ 
tor  has  not.  Such  a  distinction  could  not  have  been  forecast  from  job- 
analysis  information,  or  from  test  results,  probably,  without  a  factorial 
analysis.  The  only  disadvantage  of  the  recommended  test  lies  in  its 
slight negative loading with pilot interest.

Mechanical  Experience  Tests 
Mechanical Information, CI905, for pilot and navigator.

Tool Function, CI906, for pilot and navigator.

Note. — Tool Function had a slightly higher loading in this factor in
one analysis than Mechanical Information had in an average of several
analyses. This finding needs verification before much attention is paid to
it. Mechanical Information is apparently more pure, though the secondary
loading of perceptual speed in Tool Function, if real, would not detract
from either pilot or navigator selection, and could probably be removed
in later forms.


Paired  Associates  Memory 

Memory  for  Plane  Silhouettes,  CI503,  for  pilot  or  navigator. 

Memory for Ships, CI504, for pilot or navigator.

Note. — Both  tests  have  secondary  loadings  in  perceptual  speed  and 
spatial  relations,  both  of  which  are  valid  for  pilot  and  navigator.  The 
second  has  less  verbal  variance  and  might,  therefore,  be  better  for  pilots. 

Visual  Memory 

Plane  Formation,  CP805,  for  pilot  or  navigator. 

Note. — New  forms  of  this  test  would  undoubtedly  be  better.  Map- 
memory  tests  are  not  recommended,  because  there  is  probably  too  much 
verbal  component  for  the  pilot,  and  more  paired-associates  memory  and 
perceptual  speed  than  should  be  tolerated. 

Memory  III 

Plane Name Memory, CI506, for pilot or navigator.

Memory for Landmarks, CI510, for pilot or navigator.

Note. — Both  tests  would  cover  both  memory  I  and  memory  III  quite 
well. They are otherwise pure except for a small amount of perceptual
speed.  They  should  be  rid  of  memory  I  if  possible.  This  seems  unlikely, 
unless  a  better  hypothesis  of  the  nature  of  memory  III  appears. 


Numerical  Tests 

Numerical Operations, CI701, for bombardier and navigator.

Note. — There is no close competitor either in terms of strength or of
purity.  It  is  easy  to  find  numerical  variance  combined  with  either  gen¬ 
eral  reasoning  or  with  spatial  relations  in  other  tests.  The  combination 
with  reasoning  would  be  satisfactory  for  the  navigator  but  superfluous 
for  the  bombardier. 

Perceptual  Speed  Tests 

Speed  of  Identification,  CP610,  for  pilot,  navigator,  or  bombardier. 
Spatial  Orientation  I,  CP501,  for  pilot,  navigator,  or  bombardier. 

Pilot  Interest  Tests 

General Information (Pilot score), CE505, for pilot.

Planning Tests

Planning Air Maneuvers, CI408, for navigator.

Practical Estimations I, CI308, for pilot.

Note. — The  first  test  would  be  very  valid  for  the  navigator  if  both 
the planning and integration III factors are valid. There is too much of
both verbal and integration III in it for a good pilot test. The second
test combines planning with judgment and mechanical experience, both
components of the pilot criterion.

Psychomotor Coordination

Rotary Pursuit, CM803, for pilot.

Complex Coordination, CM701, for pilot.

Note. — A better test of the psychomotor I factor could be produced.

Psychomotor Precision Tests

Discrimination Reaction Time, CP611, for bombardier or navigator.

Note. — No really good test of this apparently is available. Fortunately,
this test's stronger loading in spatial relations contributes to validity
for both assignments.

Psychomotor Speed Tests

Log  Book  Accuracy  (no  code),  for  navigator. 

Marking  Accuracy  (no  code),  for  navigator  or  pilot. 

Note.— The  first  test  has  a  strong  secondary  loading  in  the  numerical 
factor and so is not recommended for the pilot. The second has a secondary
loading in perceptual speed and so would do for either assignment.

General  Reasoning 

Mathematics  B,  CI206,  for  navigator. 

Forced Landing, CI652, for navigator.

Pattern Comprehension, CP803, for navigator.

Note. — The first test combines about equal loadings in reasoning and
numerical factors, but the factors are about equally strong in the criterion.
The second test has a substantial secondary variance in integration
II. If that factor should be found correlated zero or negative with navigation,
the second test is out of the running. The third test carries small
loadings in perceptual-speed, verbal, and visualization factors, all of which
do no particular harm for the navigator.

Reasoning  II 

Gottschaldt Figures, CP901A, might be recommended for pilot or
navigator,  but  not  enough  is  known  regarding  other  nonchance  variance. 
It  is  probably  a  very  complex  test. 

Note. — There  is  no  good  recommendation  here.  Figure  analogies  is 
equally  strong  for  the  factor,  but  is  exceedingly  complex,  having  factors 
with  zero  or  negative  relation  to  the  pilot  criterion  and  factors  with  un¬ 
known  relation  with  the  navigator  criterion.  A  much  better  test  could 
be  developed.  Something  along  the  line  of  Pattern  Reasoning  of  the  judg- 
ment-and-reasoning  battery  is  recommended  as  a  starting  point. 

Reasoning  III 

Note. — Only  two  tests  compete  for  this  list,  and  both  are  too  com¬ 
plex,  with  secondary  loadings  whose  validities  are  either  zero  or  negative 
for  the  pilot,  and  are  largely  unknown  for  the  navigator. 

Space I, Spatial Relations Tests

Instrument Comprehension II, CI616, for pilot.

Complex Coordination, CM701, for pilot.

Dial and Table Reading, CP621, 622, for navigator and bombardier.

Discrimination  Reaction  Time,  CP611,  for  bombardier  and  navigator. 

Note.  An  abundance  of  good  tests  here  makes  it  possible  to  dis¬ 
criminate  among  the  specialties  to  which  the  test  is  best  suited,  since  the 
factor  is  a  component  of  all  three  criteria.  Secondary  variances  in  the 
first  test  are  reasoning  II,  visualization,  with  a  small  verbal  admixture. 
For the second test the chief secondary component is psychomotor
coordination, which is of no use apparently for either bombardier or
navigator. The third test presents a strong numerical component and some
perceptual speed, both of which are not amiss for bombardier as well as
navigator. The last test has the strongest secondary variance in psychomotor
precision, for which the pilot has no apparent use.

Space  II  Tests 

Hands, CP512, for pilot or navigator.

Note.— A  rather  pure  test  for  the  factor.  It  is  believed  that  the  new 
test  developed  in  the  program  will  be  even  better,  that  is,  Position  Orien¬ 
tation,  CP526. 

Space III Tests

Plotting  Test,  CE455  (Rights  score),  possibly  for  navigator. 

Note. — This factor is of such uncertain status that it is probably
premature to recommend any test for it. The factor's pilot validity as
indicated by limited early results calls for no test of this factor for pilots.

Social  Science  Background 

Note. No test will be recommended, owing to the uncertain definition
of this factor. Certainly, the two tests with communality in it, Geography
and History, have stronger verbal variance than they have in this
factor. The negative pilot validity estimated for the factor is of interest,
but  hardly  calls  for  attempts  to  measure  it  with  negative  weights  in  the 
composite.  On  general  principles,  the  negative  weighting  of  any  trait 
score  means  adverse  selection  for  that  component;  although  that  com¬ 
ponent  may  be  negatively  related  to  piloting  a  plane,  it  might  be  positively 
related  to  success  as  a  plane  commander. 

Verbal  Tests 

Vocabulary,  CI604,  for  navigator  and  perhaps  for  bombardier. 

Technical  Vocabulary  and  Information  (navigator  score),  CE505C, 
for  navigator. 

Note. — Both  tests  are  practically  pure  for  the  factor.  For  face  validity 
it  might  be  better  to  use  the  second  test  in  preference  to  the  first.  Read¬ 
ing  Comprehension  is  much  too  complex  to  be  recommended. 

Visualization  Tests 

Mechanical  Principles,  0903,  for  pilot  or  bombardier. 

Spatial  Visualization  I,  0204,  for  navigator  or  bombardier. 

Pattern  Comprehension,  CP803,  for  pilot  and  bombardier. 

Mechanical  Movements,  0904,  for  pilot. 

Directional  Plotting,  CE455,  for  navigator. 

Note.— A  wealth  of  strong  visualization  tests  exists — none,  however, 
pure.  The  mechanical  tests  mentioned  in  this  list  give  large  portions  of 
remaining  nonchance  variance  to  the  mechanical  factor  which  can  do  the 
tests  no  harm  for  pilot  selection.  The  Spatial  Visualization  I  test  is  recom¬ 
mended  for  navigator  in  spite  of  its  complexity  in  factors  whose  validity 
for navigator is unknown. The test's validity for navigator is so high
that not much risk seems to be taken. The Directional Plotting test is one
to watch in connection with visualization, since both rights and wrongs
scores are loaded with it. The rights score also carries some numerical
and space III variance, the latter being still a question mark; the wrongs
score carries some carefulness variance, which is presumed to be related
to the navigator criterion, but that fact must still be established. These
qualifications would probably justify the striking of this test from the
list pending further data. It is believed that tests developed very late in
the war will prove to be much purer for the visualization factor. Their
analysis is an important step remaining to be done.

SOME GENERAL EXPERIENCES AND RECOMMENDATIONS

It would be without a sense of completeness to bring to a close this
volume, leaving unmentioned some of the more general lessons learned
and the implications which came from the rich experiences involved in
the printed-test program. Many of the fruits of these experiences have
been presented in chapters 2 and 3, in connection with the brief account
of research plans and procedures. Others have been pointed out or implied
in chapters on tests as the opportunity arose. In the paragraphs to
follow, some of these points will be repeated for emphasis and by way
of summary. Some new points will be brought out because they arise
from viewing the results in perspective and in retrospect. Many of them
are not new ideas; they have simply been emphasized by the pressure of
repeated and massive experience. Others emerge from the systematic
approach through factor theory and factorial results. It is believed that
those  who  shared  in  the  experiences  all  recognize  the  problems  entailed 
in  the  points  to  be  mentioned.  In  places  there  may  be  some  divergencies 
of  interpretation  and  recommendation.  It  must  be  remembered  that  the 
following  account  is  mediated  primarily  through  one  observer. 

Factorial  Batteries 

Experiences,  as  recorded  in  the  preceding  chapter,  in  particular,  tend 
to  strengthen  the  conviction  that  there  are  a  limited  number  of  stable 
reference  variables  in  both  tests  and  criteria  in  terms  of  which  test  bat¬ 
teries  can  best  be  adapted  to  vocational  predictions.  The  ideal  test  bat¬ 
tery,  then,  should  provide  variances  in  all  the  factors  that  are  significantly 
weighted in pertinent job criteria.

The  traditional  method  of  extracting  a  single  predictive  score  from 
a composite of several tests has been based upon the multiple-regression-equation
principle, each part of the composite being optimally weighted
by least-square fit in order to maximize the multiple correlation between
composite and criterion. It is common knowledge that the lower
the intercorrelation of parts, the greater the multiple correlation. Factors
have been found to be either independent, i.e., uncorrelated, or to have
small positive intercorrelations. Pure tests of factors would accordingly
yield maximum independence among parts of a composite. There are
other ways of arriving at tests with unique contributions; any two tests
with zero intercorrelation coupled with validity of both for a criterion
will satisfy this demand statistically. The fact that two tests correlate
zero with one another, however, does not assure that they are unique
with respect to a third test, nor even that they are factorially pure. Even
a complete battery of nonoverlapping tests might be arrived at by trial
and  error.  It  is  believed  that  the  shortest  route  to  this  goal  is  through 
factor  analysis  and  factor  hypotheses.  It  is  also  maintained  that  the  ap¬ 
proach  is  decidedly  more  meaningful  and  that  this  reduces  the  amount 
of  trial  and  error  to  a  minimum. 
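
The effect of intercorrelation upon the multiple correlation can be illustrated with the standard two-predictor formula. The sketch below is not drawn from the program's data; the validities of 0.30 and the intercorrelations tried are assumed values, and the function name is hypothetical.

```python
import math


def multiple_r(r1c, r2c, r12):
    """Multiple correlation of a criterion with two optimally weighted predictors.

    r1c, r2c: validities of tests 1 and 2 for the criterion.
    r12: intercorrelation of the two tests (must not be +1 or -1).
    """
    r_squared = (r1c ** 2 + r2c ** 2 - 2.0 * r1c * r2c * r12) / (1.0 - r12 ** 2)
    return math.sqrt(r_squared)


# Two tests of equal validity (0.30, an assumed value) at three levels of overlap.
for r12 in (0.0, 0.3, 0.6):
    print(f"intercorrelation {r12:.1f} -> multiple R {multiple_r(0.30, 0.30, r12):.3f}")
```

With both validities positive and equal, the multiple correlation falls steadily as the two tests come to overlap. The relation is stated here for positively valid tests only; cases in which one test acts as a suppressor behave differently.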

The classification battery in use at the close of the war included 19
different tests which yielded 21 different scores. The best evidence available
indicates that this battery covers 12 known factors, plus two additional
ones not identified (the hypothesized kinesthetic-motor factor in
the rudder control test and a completely unidentified factor in the Biographical
Data Blank). The number of factors that have positive loadings
in the pilot criterion appears to be about 20: 17 that appear in table
28.17 plus the 2 mentioned above and 1 unidentified in the experimental
test Memory for Tactical Plans. The navigator and bombardier criteria
are very incompletely known, but the present knowledge, as represented
in table 28.14, would indicate that a classification battery for the three
assignments requires at least four additional factors. The battery just
referred to probably accounts for but 14 of the some 25 factors that
should be covered, not to mention a number of temperament factors
that have not even been brought to light.

The  coverage  of  25  distinct  factors  would  require  at  least  25  tests  or 
scores,  even  with  pure  tests.  The  reason  why  21  scores  account  for  only 
14  factors  is  the  great  amount  of  replication  of  coverage  of  the  same 
factors,  in  some  instances,  by  as  many  as  4  or  5  different  tests.  It  is  true 
that some factors are paired in tests in a way that meets the specifications
required by similar combinations in criteria. An example of this
is the combination of numerical and general-reasoning factors in the
Arithmetic Reasoning test, and another is the spatial-relations and psychomotor-coordination
factors in Complex Coordination. The difficulty is
that the relative weightings may not be optimal and that in some other
criterion the two factors may not be similarly paired. For a general-purpose
battery that goes beyond three assignments, the needs for test
purity are even more apparent. There seems no escaping the fact that
any vocational battery requires a large number of tests for complete coverage,
even at a minimal practical level of prediction. There are other
reasons  for  large  numbers  of  tests  in  batteries  as  the  following  discus¬ 
sion  will  also  show. 

Maximizing  Discriminations  in  Classification 

Implied  in  the  previous  discussion  is  not  only  the  goal  of  achieving 
maximal  selection  for  each  assignment  but  also  that  of  making  the  max¬ 
imal  differential  selection,  that  is,  optimal  classification.  This  problem 
has  not  received  the  analytical  (in  the  mathematical-deductive  sense) 
treatment  that  it  deserves.  It  is  probably  true  that  when  the  first  of  these 
two  goals  is  approached  the  second  is  also  nearer  fulfillment,  particularly 
for those individuals who are qualified for one assignment and disqualified
for others. Of those who are qualified for more than one assignment,
however, there still remains the problem as to which of two assignments
is the better. The lower the disqualification rate, the more
serious this becomes.

Criteria of job success are, of course, not mutually independent. They
possess factors in common also. Being themselves much more complex
than most tests, their degree of independence might be expected to range
somewhat lower than that for tests, even relatively complex tests. We
have no direct evidence as to just how independent the criteria for bombardier,
navigator, and pilot training are. There were insufficient numbers
of individuals who attempted more than one of these types of training.
Even if there had been large numbers, the conditions for estimating
the correlation among criteria would have been rather short of satisfactory.
Very rough estimates of these intercorrelations have been made
from the known factor components given in table 28.14. The correlation
between any pair of criteria would be given by the sum of the cross
products of pairs of factor loadings. With this method as the basis,
using the incomplete information in table 28.14, we find that no intercorrelation
exceeds 0.20. The intercorrelations between composite aptitude
scores for the various classification batteries were typically above
0.50. In only one instance was any one as low as the region of 0.25, and
that was between the navigator and pilot stanines. One reason for the
failure to achieve the proper degree of independence was the fact of impure
tests. Another was that each composite was set up independently of
the others. An analytical solution to the problem would have made possible
the derivation of weights for any one composite taking into account
the goal of maximal discrimination in prediction.
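
The rough estimate described above, the sum of cross products of factor loadings, can be sketched as follows. The loadings shown are invented for illustration and are not those of table 28.14; the sketch also assumes uncorrelated factors, so that only factors common to both criteria contribute.

```python
def criterion_correlation(loadings_a, loadings_b):
    """Estimate the correlation of two criteria from their factor loadings.

    Assumes orthogonal (uncorrelated) factors, so the estimate is the sum of
    cross products of loadings over the factors the two criteria share.
    """
    shared = set(loadings_a) & set(loadings_b)
    return sum(loadings_a[f] * loadings_b[f] for f in shared)


# Hypothetical criterion loadings (illustrative values only).
PILOT = {"spatial": 0.30, "mechanical": 0.25, "psychomotor": 0.30}
NAVIGATOR = {"spatial": 0.30, "numerical": 0.40, "reasoning": 0.35}

estimate = criterion_correlation(PILOT, NAVIGATOR)
print(f"estimated criterion intercorrelation: {estimate:.2f}")
```

Because only one factor is shared in this made-up case, the estimated intercorrelation is small, of the same order as the values the text reports for the incomplete table.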

Even if pure tests had been utilized, however, the intercorrelations of
the stanines would not have been a minimum, owing to the intercorrelation
of error variances in tests. A test score cannot, unfortunately, be
weighted in a composite without also weighting its error variance. Deviations
due to this source are in the same direction in all composites in
which the score is a part. The best solution seems to be to use independently
derived measures of each factor in different composites. This may
call for more extra work than is justifiable; the cost in extra testing
time and material may be too great. It would call for two or more different
tests or scores for each factor that is weighted in more than one
composite. There was enough of such duplication in recent classification
batteries to permit this procedure, and in one or two instances this type
of discrimination was utilized. In a battery of relatively pure tests, however,
either two or more tests of each factor would be called for in some
instances, or the same test could be given in separately timed parts, each
score being used independently. The Numerical Operations test is already
scored twice (front and back of an answer sheet). The parts would yield
scores of lower reliability than would the two combined, but it is believed
that the goal of reliability is of less importance than that of factor
coverage, a point that will be discussed in the paragraphs following. The
writer is definitely inclined to a large and varied battery of short, even
though less reliable, tests in preference to a restricted battery of longer
and more reliable tests. In the same time interval allowed for testing,
there is probably much more to be gained by the use of a more comprehensive
battery, even if something must be sacrificed in terms of reliability
of single tests.
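
How much reliability is given up when a test is scored in two separately timed parts can be gauged with the Spearman-Brown formula. The text does not name that formula; it is brought in here as a conventional approximation that assumes the two parts are parallel. The full-test reliability of 0.90 is an assumed figure, and the function names are hypothetical.

```python
def lengthened_reliability(reliability, factor):
    """Spearman-Brown: reliability of a test lengthened by `factor` with parallel material."""
    return factor * reliability / (1.0 + (factor - 1.0) * reliability)


def half_length_reliability(full_reliability):
    """Reliability of one of two parallel halves, given the full-test reliability."""
    return full_reliability / (2.0 - full_reliability)


full = 0.90  # assumed reliability of the whole test
half = half_length_reliability(full)
print(f"each half: {half:.3f}; recombined: {lengthened_reliability(half, 2):.3f}")
```

Under these assumptions the loss from halving a reliable test is moderate, which bears on the trade of reliability for factor coverage discussed above.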

Reliability and Factor Variance

So  much  importance  has  been  given  traditionally  to  the  matter  of  test 
reliability  that  some  comments  are  needed  here  in  defense  of  the  state¬ 
ments  just  made.  Arguments  of  a  nonanalytical  sort  will  be  offered  to 
support  those  statements.  A  little  common-sense  reasoning,  based  upon 
some  fundamental  theory  about  the  factorial  composition  of  tests,  should 
suffice  for  this  purpose. 

In factor theory, the total variance of a test score is conceived as being
composed of a number of independent component variances, additively
combined. In terms of an equation,

$$\sigma_I^2 = a_{1I}^2 + a_{2I}^2 + a_{3I}^2 + \cdots + u_I^2 + e_I^2 \qquad (29.1)$$

in which $\sigma_I^2$ is the total variance of test I, $a_{1I}^2$, $a_{2I}^2$, etc., are variances
of factors 1, 2, etc., in test I, $u_I^2$ is any unique variance there may be in
the test, and $e_I^2$ is the error variance in the test. The reliability of the
test, $r_{II}$, is the sum of all the nonchance variances, in other words, the
sum of all terms in the right-hand side of the equation except the last.

The  other  side  of  the  picture  is  the  contribution  of  each  factor  to  the 
validity  of  a  test.  The  fundamental  equation  for  the  validity  of  a  test  in 
terms  of  its  common-factor  loadings  is  stated  on  page  839,  but  is  repeated 
here  in  modified  form  to  apply  to  test  I : 

$$r_{IP} = a_{1I}p_1 + a_{2I}p_2 + a_{3I}p_3 + \cdots + a_{nI}p_n \qquad (29.2)$$

in which $r_{IP}$ is the validity of test I for pilot training, $a_{1I}$, $a_{2I}$, etc., are loadings
of factors 1, 2, etc., in test I, and $p_1$, $p_2$, etc., are loadings of factors
1, 2, etc., in the pilot criterion.

It is fairly obvious from equation (29.2) that of the values in
equation (29.1) only the nonzero loadings $a_{1I}$, $a_{2I}$, etc., have any bearing
upon the size of the validity coefficient. They do also contribute to the
reliability of the test, as can be seen from equation (29.1). But there
is considerable freedom for $r_{II}$ to vary independently of them.

Let us assume that in test I only loadings $a_{1I}$, $a_{2I}$, $a_{3I}$, and $a_{4I}$ are positive
and that of these, only factors 1 and 2 are positive in the pilot criterion.
The validity of test I depends therefore upon these two factors and their
loadings in test and criterion. From the first equation, it can be seen
that $r_{II}$ could change, either increasing or decreasing, without affecting
the validity of the test, if that change is brought about by changes of
variances $a_{3I}^2$, $a_{4I}^2$, or $u_I^2$. The reliability could even shrink to the sum
of $a_{1I}^2 + a_{2I}^2$ without affecting the validity of test I. Even in relatively pure
tests, if the single factor loading squared, $a_{1I}^2$, does not equal $r_{II}$, there is
room for loss of reliability without loss of validity. It is not known what
the typical change in factor variances is when there is a change in $r_{II}$.
It may usually mean a proportional change all along the line. But it is
deemed possible by proper control to devise new forms of tests in which
variances of common-factor components can be increased or decreased
independently and at will. Experience in the program leads to the belief
that we are entering a stage in test construction where that refinement
will not be uncommon.
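
The argument of the last paragraphs can be checked numerically. The sketch below follows equations (29.1) and (29.2) for a standardized test with uncorrelated factors; every loading in it is an invented value, and the function names are hypothetical.

```python
def reliability(test_loadings, unique_variance=0.0):
    """Sum of the nonchance variances of a test, as in equation (29.1)."""
    return sum(a * a for a in test_loadings.values()) + unique_variance


def validity(test_loadings, criterion_loadings):
    """Sum of products of test and criterion loadings, as in equation (29.2)."""
    return sum(a * criterion_loadings.get(f, 0.0) for f, a in test_loadings.items())


# Hypothetical pilot criterion: only factors 1 and 2 are loaded.
CRITERION = {"f1": 0.40, "f2": 0.30}

# The same test before and after its nonvalid variance (f3, f4, unique) is removed.
complex_test = {"f1": 0.50, "f2": 0.40, "f3": 0.40, "f4": 0.30}
trimmed_test = {"f1": 0.50, "f2": 0.40}

for name, test, unique in (("complex", complex_test, 0.10), ("trimmed", trimmed_test, 0.0)):
    print(f"{name}: reliability {reliability(test, unique):.2f}, "
          f"validity {validity(test, CRITERION):.2f}")
```

The trimmed form has far lower reliability but exactly the same validity, since the variance removed lay in components that carry no weight in the criterion.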

In this connection it is pertinent to cite one or two dramatic instances
in which unusually low reliability has been accompanied by substantial
validity. One is the Path Length test, CP628B, whose reliability was estimated
as 0.25 (alternate-forms) and whose pilot validity was 0.23. Another
is the Biographical Data Blank, CE602D, navigator score, whose
reliability was estimated as 0.35 (split-half) and 0.49 (test-retest) and
whose navigator validity was 0.23. A 15-item judgment test (part III
of the Air Corps Qualifying Examination, AC10A) had a reliability of
0.36 (odd-even) and a pilot validity of 0.36.

From these cited results it looks as though the common advice that if
a test shows validity one can forget about its reliability might be sound,
at least in some tests. To see whether this type of case might be more
general, a correlation was computed between the best estimates of reliability
and of pilot validity for 74 tests. No test was included unless an
alternate-form type of reliability was available or an odd-even type in a
clearly power test. The coefficient was almost exactly zero. There may
be constant errors, such as certain types of test, like vocabulary and
mathematics tests, which have low validity but high reliability, that load
the situation. A better controlled test of the matter could be made, but
the fact is cited for what it is worth. Examination of lists of reliabilities
of intellectual and perceptual tests described in this volume will show
that they range from about zero to 0.97, with a median of about 0.80,
the distribution being markedly skewed. Were we to hold out for a high
minimum reliability level, many a useful test would not be utilized.

The question as to whether the variance components of a test that are
not valid for a criterion should be in the nature of error variance or of
variance in other, but nonvalid, factors, is an open one. We may assume
for the sake of argument here that there is a real choice in the matter;
that with the valid factor variance held constant, the reliability may equal
that variance or it may be substantially greater by the addition of other
common-factor variance none of which contributes to validity. In the first
alternative, all nonvalid variance of a test is given over to error variance;
and in the second, the nonvalid variance is divided between irrelevant but
nonerror variance and error variance. What will be the different effects
upon the contribution of this test to a composite under these two
conditions? There are probably several effects, but one of them is that the
test with secondary and perhaps tertiary factor variances will to that
extent be selective in the directions of those additional factors whether
we want that kind of selection injected into the composite or not. If all
the nonvalid variance were error variance, no change of direction of
selection would be entailed. From this line of thinking, therefore, it would
seem preferable to use a less reliable, pure test rather than a more reliable,
more complex test, when both have equal projections on the valid factor
that we wish to measure.

Validities  of  Pure  Tests 

In the search for valid pure tests, one finding is disturbing to the
investigator who, following the traditional teachings on test construction,
works toward maximizing the validity of each test. If the latter is the
sole objective, we almost always end up with complex tests. An exception
to this is when a criterion has a high saturation in some factor or factors,
as the navigator criterion has in the numerical and general-reasoning
factors. It turns out, as it did in the present program, that the most valid
tests are complex: Reading Comprehension, Arithmetic Reasoning,
Mechanical Principles, Biographical Data Blank (pilot score), Dial and Table
Reading, and the Complex Coordinator test. By comparison, pure tests
like Vocabulary, Speed of Identification, Numerical Operations, and
Mechanical Information have suffered. All have been taken out of the
classification battery, giving way to other tests at one time or another because of
lower validity coefficients. In the absence of thorough knowledge of
intercorrelations or of factor loadings and their validities, the temptation is
strong to do that very thing. In the light of such knowledge, all of these
four, except the vocabulary test, were returned to the battery.

As  another  aspect  to  this  matter,  the  attempt  to  improve  a  test  by 
making  item-validity  studies  also  works  toward  complex  tests.  Any  item 
may correlate with the criterion for as many reasons as there are factors
in common between them. One item is valid because of factor A, another
because of factor B, and still a third because of factor C. Or, as with
total test scores, an item that is itself factorially complex has a greater
likelihood  of  exhibiting  significant  validity  and  so  of  being  retained. 
While  the  validity  of  the  total  score  is  thus  raised,  the  uniqueness  of 
the  test  is  not  thereby  promoted. 

A  Technique  for  Test  Purification 

Tests have been improved with respect to validity by the procedure of
item validation just mentioned. They have been improved with respect
to internal consistency and hence reliability by item correlation against
total scores. They should also be subject to purification by a similar
process of item selection based upon item correlation with criteria of
known factors. The selection can be both positive and negative, that is,
acceptance of items that correlate acceptably with the factor to be
maximized in the test, and the rejection of items that correlate to an
unacceptably high degree with other factors. An arithmetic-reasoning
test might be made more of a reasoning test and less of a numerical test
by rejecting items that correlate too strongly with a Numerical Operations
score. Other tests may be reduced in verbal variance by correlating
items with a vocabulary score. Reading Comprehension might be rid of
mechanical-experience variance by correlating items with a Mechanical
Information score.
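
The positive and negative selection just described might be sketched as follows. This is only an illustration under stated assumptions: the cutoffs of 0.30, the function names, and the tiny data set are hypothetical, and each "reference score" stands in for a criterion of a known factor.

```python
import math


def pearson(x, y):
    """Product-moment correlation; returns 0.0 if either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def purify(items, target_score, other_score, min_target=0.30, max_other=0.30):
    """Keep items that correlate with the wanted factor and not with the other.

    items: dict mapping item name -> list of 0/1 responses per examinee.
    The two cutoffs are arbitrary illustrative values.
    """
    kept = []
    for name, responses in items.items():
        r_target = pearson(responses, target_score)
        r_other = pearson(responses, other_score)
        if r_target >= min_target and abs(r_other) <= max_other:
            kept.append(name)
    return kept


# Hypothetical data for eight examinees.
reasoning_reference = [1, 2, 3, 4, 5, 6, 7, 8]
numerical_reference = [1, 0, 1, 0, 1, 0, 1, 0]
items = {
    "reasoning_item": [0, 0, 0, 0, 1, 1, 1, 1],
    "mixed_item": [1, 0, 1, 0, 1, 1, 1, 1],
    "numerical_item": [1, 0, 1, 0, 1, 0, 1, 0],
}
kept_items = purify(items, reasoning_reference, numerical_reference)
```

In this toy case the mixed item passes the positive test but is rejected for its correlation with the numerical reference score, which is the kind of item an ordinary item validation would tend to retain.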

This technique was given a trial in connection with the test Mechanical
Principles, in an effort to segregate the mechanical and visualization items,
with no great success. The difficulty probably lay in the fact that the
items had already been put through tests of internal consistency and some
of them through an item validation, which, as was said before, works
toward complexity. To be most effective, the purification by item
correlation must begin early. It would be wise, in constructing the items
originally, to have hypotheses about what introduces variance of one kind
or another into an item and to strive for the kind of purity one wants.
Awareness of what the factors are and in what kind of items they are
likely to appear is a great help in this. The success that was achieved in
keeping verbal variance out of a number of tests where it might well
have crept in, e. g., Mechanical Information, some of the integration tests,
etc., is an indication of what can be done.

Homogeneous  Versus  Heterogeneous  Tests 

During  the  course  of  test  development  in  the  program,  it  became  ap¬ 
parent  to  many  of  the  personnel,  as  it  had  been  known  from  the  begin¬ 
ning  by  others,  that  tests  fall  into  two  categories  as  determined  by  the 
homogeneity  of  their  content.  Homogeneity  may  be  of  different  kinds. 
Homogeneity  of  form  and  content  can  be  noted  by  superficial  examina¬ 
tion.  Homogeneous  items  from  this  point  of  view  will  look  alike  in 
certain  respects.  Homogeneity  of  function  is  detected  by  means  of  item 
intercorrelation or of item correlation with some common criterion, such
as  total  score  in  the  test  of  which  they  are  a  part. 

The  similarity  of  form  refers  to  the  technical  nature  of  the  item, 
whether  it  is  a  multiple-choice,  matching,  or  true-false,  etc.,  type.  Con¬ 
tent  similarity,  by  superficial  inspection,  refers  to  the  material  used — 
words,  diagrams,  forms,  machines,  colors,  and  the  like.  Each  test  is 
usually  consistent  in  form  and  in  content  throughout,  and  the  choice  of 
either is determined by such considerations as conveniences of group
administration,  use  of  answer  sheet,  machine  scoring,  etc.,  and  also  by 
the  nature  of  the  task  which  it  is  believed  will  best  bring  out  individual 
differences  in  the  trait  being  measured.  While  these  technical  uniformities 
of  items  have  some  bearing  upon  the  factorial  composition  of  item  vari¬ 
ances  and,  therefore,  upon  the  functional  homogeneity  of  a  test,  there  is 
much  latitude  for  diverse  functional  nature  within  the  same  set  of  similar 
items.  It  is  the  functional  homogeneity  of  items  in  which  we  are  most 
interested  here.  The  degree  of  functional  homogeneity  is  indicated  by 
item intercorrelations, but not as simply as one might think, as will be
shown. 

Item  analysis  in  which  the  criterion  of  homogeneity  of  an  item  with 
other  items  in  the  test  is  the  correlation  of  items  with  total  provisional 
score on the test, tends undoubtedly toward increased homogeneity when
the provisional score is itself relatively unambiguous, that is, factorially
pure. When the provisional total score is itself factorially complex, it
can be seen that the selection of items having greatest item-total
correlations will not necessarily increase homogeneity in the factorial sense.
Item  validation,  in  which  the  basis  of  item  selection  is  correlation  of 
item  with  an  outside  practical  criterion,  is  also  likely  to  lead  toward 
greater  factorial  heterogeneity,  as  was  pointed  out  before. 

The  moral  of  this  discussion  is  that  if  one  desires  valid,  unique  tests, 
one  should  proceed  slowly  in  the  use  of  the  item-total  correlation  until 
one  has  a  fairly  unambiguous  total  score.  The  correlation  of  items  with 
a  job  criterion  may  be  used  very  well  as  an  exploratory,  preliminary 
step.  The  valid  items  should  be  scrutinized  in  order  to  derive  a  hypothesis 
as  to  their  validity.  Items  that  seem  to  fit  the  hypothesis  in  common  may 
then  be  used  as  a  cluster  for  deriving  a  provisional  total  score  to  be  used 
as a new criterion for new item correlations, of items within the cluster
as well as of newly constructed items which also appear to fit the
hypothesis. In developing the General Information test, pilot score, for
example, having found that the factor of pilot interest is the unique
variance that the test has to offer, and that mechanical experience is
another strong variance but well covered by another test, one should,
according to this line of reasoning, reject items that correlate highly with
the Mechanical Information total score (better yet, transfer those items
to that test), then make an internal-consistency analysis of remaining
items with a total score based upon remaining items as the criterion.
If there are then some items with low consistency but still with significant
pilot validity, effort should be made to understand the nature of any
other valid factor that may be represented.

Apart from the goal of highly homogeneous, unique tests, if one has
a test that is obviously heterogeneous and merely desires to maximize
its validity, the route would not be through increasing internal
consistency, but quite the opposite. Applying the common multiple-regression
principles, one would strive to maximize the correlation of each
item with the job criterion and to minimize the intercorrelations of the
items. The selection should favor items of the type that will bring that
about. This procedure, however, would seem merely to result in an
extension of our ignorance to new valid territory, rather than to increase
our knowledge of why tests are valid and therefore to improve our
control over validity already achieved.
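
The regression principle invoked here can be illustrated with the standard formula for the correlation of a sum. The sketch assumes n equally weighted, standardized items described only by their mean item validity and mean intercorrelation; the function name and the figures are hypothetical.

```python
import math


def composite_validity(n_items, mean_item_validity, mean_interitem_r):
    """Validity of an unweighted sum of n standardized items.

    Numerator: sum of the item-criterion covariances.
    Denominator: standard deviation of the sum.
    """
    numerator = n_items * mean_item_validity
    variance = n_items + n_items * (n_items - 1) * mean_interitem_r
    return numerator / math.sqrt(variance)


# Same mean item validity, different mean intercorrelations (hypothetical).
heterogeneous = composite_validity(20, 0.10, 0.05)
homogeneous = composite_validity(20, 0.10, 0.20)
```

With item validities held at 0.10, the set with the lower intercorrelations gives the higher total validity (about 0.32 against about 0.20), which is why selection for validity alone runs counter to internal consistency.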

Power  Tests  Versus  Speed  Tests 

Not  a  great  deal  was  learned  concerning  the  relative  merits  of  power 
tests  and  speed  tests,  nor  what  effect  working  time  as  a  determining 
factor  has  upon  test  results.  The  problem  arose  many  times,  but  each 
particular  instance  of  it  was  met  as  it  occurred,  without  the  benefit  of 
any  new  general  principles  having  been  discovered  or  any  general 
studies being made upon them. Some tests were administered with
different time limits for the same material, and routine reliability and
validity  studies  were  carried  through,  but  without  results  justifying  any 
statement  of  generalizations.  Only  one  or  two  rather  disjointed  comments, 
therefore,  can  be  offered  on  this  question. 

In  tests  in  which  it  was  desired  that  all  examinees  attempt  or  respond 
to  every  item,  the  device  of  "pacing"  the  group  was  found  useful. 
Reference  to  some  test  descriptions  in  earlier  chapters  will  show  that  one 
or  more  times  during  the  work  on  a  test  the  administrator  would  break 
in with a statement to the effect that "at this time you should be working
on item number X." In spite of this, there would still be a limited number
who  might  not  complete  a  test  in  the  allotted  time,  liberal  though  it  was. 
There  is  the  other  problem  of  the  many  who  complete  a  test  much  earlier 
than  the  given  limit  and  who  have  nothing  to  occupy  them  until  the  next 
test  is  called  for.  To  meet  this  situation,  it  is  recommended  that  many 
power tests include more items than are scored. The items at the end,
beyond  the  last  one  scored,  merely  provide  busy  work  for  the  rapid 
worker.  The  terminal,  busy-work  items  may  be  those  of  low  internal 
consistency  and  of  high  level  of  difficulty.  This  procedure  would  probably 
be  most  applicable  to  vocabulary  and  information  tests,  though  it  would 
work with others. One could then score only items that have been
attempted by everybody, or down the list to a point where any desired
percentage  has  attempted  the  last  item. 

If  it  is  not  known  beforehand  on  the  basis  of  soundest  theory  or 
empirical evidence whether a test is better as a speed test or as a power
test, the recommended procedure would enable us to settle the point. If
all individuals attempt items through the nth one, it would be desirable
to determine the validity of scores derived from n items, n + 5 items,
n + 10, n + 15, and so on as far as one desires to carry the study.
Allowances  for  relation  of  validity  to  test  length  would  need  to  be  made. 
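
A minimal sketch of such a study follows. It assumes item responses are stored per examinee in test order and scored 1 or 0; the function names and the tiny data set are hypothetical, and no allowance for the relation of validity to test length is attempted.

```python
import math


def pearson(x, y):
    """Product-moment correlation; returns 0.0 if either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def validity_by_length(responses, criterion, start, step):
    """Validity of the score on the first `length` items, for growing lengths.

    responses: one list of 0/1 item scores per examinee, in test order.
    """
    n_items = len(responses[0])
    results = {}
    for length in range(start, n_items + 1, step):
        scores = [sum(row[:length]) for row in responses]
        results[length] = pearson(scores, criterion)
    return results


# Hypothetical data: four examinees, four items, a criterion score each.
responses = [[1, 1, 1, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 0, 1]]
criterion = [4, 3, 2, 1]
validities = validity_by_length(responses, criterion, start=2, step=1)
```

In practice `start` would be the nth item attempted by all examinees and `step` would be 5, as in the text; the small values here only keep the example short.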

The time problem deserves study from another aspect. It may well be
that the validity of many a test is below its maximum because examinees
are  themselves  too  much  in  control  of  their  working  time.  Pacing  of  the 
type  mentioned  may  help  to  overcome  this  to  some  extent.  Printing  short 
sections  of  tests  on  each  page  and  timing  the  test  by  pages  is  even  better. 
Even  more  precise  control  of  working  time  per  item  may  be  desirable 
in  some  tests.  This  suggests  either  tachistoscopic  or  motion-picture  presen¬ 
tation  of  items  in  which  the  most  stringent  control  can  be  attained. 
Empirical studies of this problem are needed.

Rights Versus Wrongs Scores

The  reader  who  has  followed  the  discussion  of  even  a  small  number  of 
the  tests  in  this  volume  will  almost  certainly  have  noted  the  attention 
that  has  been  given  to  rights  and  wrongs  scores,  apart  from  their  men¬ 
tion  in  scoring  formulas.  Experience  has  repeatedly  called  attention  to 
the importance of this. In a surprisingly large number of tests the
dispersion of wrongs scores is relatively large, offering a basis for
measurement of individual differences. There is also sufficient freedom in many
tests for correlations between rights and wrongs scores to depart radically
from -1.00. There is also the striking discovery that rights and wrongs
scores  may  be  functionally  very  different,  giving  indications  of  individual 
differences  on  quite  different  continua  of  human  personality. 

The  implication  of  this  is  that  in  the  development  of  any  new  test, 
consideration should first be given to the amount of correlation between
rights and wrongs scores. If this is sufficiently different from -1.00,
let us say -0.80 or higher on the scale (when corrected for attenuation),
from  then  on  they  should  be  investigated  as  two  test  variables.  They 
should  be  validated  separately,  have  separate  determinations  of  relia¬ 
bilities,  and  each  should  be  treated  in  factorial  studies.  In  factor  analysis, 
it  would  be  best  not  to  include  both  in  the  same  matrix,  if  both  are  derived 
from  the  same  set  of  items.  A  rights  score  from  one  half  the  test  or  one 
form  and  wrongs  from  the  other  half  or  form  would  be  suitable,  avoid¬ 
ing  the  emergence  of  a  doublet  factor  unique  to  the  two.  When  these 
procedures  are  carried  out,  it  may  be  found  that  both  scores  are  called 
for in using a battery, and that either or both should be weighted in one
or more composite scores. If both should be called for in the same
composite, it is recommended, again, that they be derived from independent
sets  of  items. 
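
The first check recommended above, the rights-wrongs correlation corrected for attenuation, can be sketched with the usual correction formula. The cutoff of -0.80 is the one suggested in the text; the function names and the sample statistics are hypothetical.

```python
import math


def disattenuated_r(r_xy, reliability_x, reliability_y):
    """Correlation corrected for attenuation in both variables."""
    return r_xy / math.sqrt(reliability_x * reliability_y)


def treat_as_two_variables(r_rights_wrongs, rel_rights, rel_wrongs, cutoff=-0.80):
    """True when the corrected correlation is at or above the cutoff,
    i.e. far enough from -1.00 to study rights and wrongs separately."""
    corrected = disattenuated_r(r_rights_wrongs, rel_rights, rel_wrongs)
    return corrected >= cutoff


# Hypothetical tests: (observed r, rights reliability, wrongs reliability).
separate = treat_as_two_variables(-0.60, 0.85, 0.70)   # corrected r about -0.78
combined = treat_as_two_variables(-0.85, 0.90, 0.90)   # corrected r about -0.94
```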

Scoring  Formulas 

Closely  related  to  the  problem  of  rights  and  wrongs  scores  is  that  of 
scoring  formulas.  Experience  shows  that  the  automatic  and  indiscriminate 
use of correction-for-guessing formulas is to be severely condemned.
While  this  procedure  may  satisfy  the  logic  about  probabilities  of  chance 
success  with  an  item  by  guessing  (and  it  is  often  doubtful  whether  this 
logic  really  applies  to  the  case  for  which  it  was  intended,  as  was  pointed 
out  in  ch.  3),  it  may  at  the  same  time  have  quite  serious  effects  that  were 
not  suspected  and  which,  if  known,  would  not  be  desired. 

It might seem that the solution lies in deriving optimal scoring formulas,
giving the rights unitary weight and the wrongs a weight a, which will
maximize the multiple correlation between rights and wrongs, additively
combined, and the job criterion. When this has been done, the a priori
formula is verified less often than it is not, but validities of tests are
raised very little by change in weight a. For 10 different tests in 1 study,
the highest gain in validity for optimal weights was 0.04, and in most of
them the gains were too low to be of practical value. As compared with
validities for rights only, however, the optimal weights provided increases
from 0.03 to 0.06. In two tests, positive weights, one as high as +0.479,
were apparently best for the wrongs, with increased validities of 0.02
and 0.03 over those derived with negative, a priori, weights for the
wrongs.

The  use  of  the  optimal-weight  scoring  formula  leads  to  the  conclusion 
that very large samples are required for the estimation of stable weights,
and that the formula is extremely sensitive in some instances, giving
very  large  changes  in  a  for  small  changes  in  correlation  coefficients.  The 
wary  investigator  will  be  on  the  lookout  for  absurd  results  with  the 
formula  at  times. 
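
The behavior described in the last two paragraphs can be reproduced with a short sketch. It assumes the score R + aW, where the optimal a is the ratio of the two least-squares regression weights; the function names and every statistic below are hypothetical and are not results from the program.

```python
import math


def optimal_wrongs_weight(r_rc, r_wc, r_rw, sd_r, sd_w):
    """Weight a for wrongs (rights weighted 1) that maximizes validity.

    Ratio of the least-squares weights for wrongs and rights. It marks
    a maximum only when the denominator (r_rc - r_wc * r_rw) is positive.
    """
    return (sd_r / sd_w) * (r_wc - r_rc * r_rw) / (r_rc - r_wc * r_rw)


def formula_validity(a, r_rc, r_wc, r_rw, sd_r, sd_w):
    """Correlation of the score R + a*W with the criterion."""
    scaled_covariance = sd_r * r_rc + a * sd_w * r_wc
    variance = sd_r ** 2 + (a * sd_w) ** 2 + 2 * a * sd_r * sd_w * r_rw
    return scaled_covariance / math.sqrt(variance)


# Hypothetical statistics for one test.
stats = dict(r_rc=0.30, r_wc=-0.25, r_rw=-0.60, sd_r=6.0, sd_w=3.0)
a_opt = optimal_wrongs_weight(**stats)
v_opt = formula_validity(a_opt, **stats)
v_a_priori = formula_validity(-0.25, **stats)   # e.g. rights minus wrongs/4
v_rights_only = formula_validity(0.0, **stats)

# A shift of only 0.05 in one correlation moves the weight a great deal.
a_shifted = optimal_wrongs_weight(**{**stats, "r_wc": -0.20})
```

With these figures the optimal weight is about -0.93 and falls to about -0.22 when the wrongs validity changes from -0.25 to -0.20, while the validity gained over the a priori weight is well under 0.01. Both features match the cautions given in the text.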

Cross  Validations  of  Composites 

Another,  more  general,  caution  in  connection  with  multiple-regression 
weights  is  well  worth  mentioning.  It  has  been  recognized  that  a  coefficient 
of  multiple  correlation  as  ordinarily  computed  represents  the  maximum 
amount  of  association  between  a  pool  of  variables  optimally  weighted 
and a single criterion, and that this correlation is subject to some
shrinkage when the weights are applied to predicting the criterion in a new
sample from which the weights were not derived. The extent of this
shrinkage is sometimes estimated by means of shrinkage formulas. These
formulas are expected to indicate the amount of regression effect to be
expected.  Shrinkage  formulas  were  rarely  employed  in  the  program, 
owing  to  lack  of  full  confidence  in  them.  Instead,  as  will  be  noted  in 
chapter  24  and  following,  a  cross-validation  procedure  was  invoked  to 
determine  empirically  whether  a  scoring  key  would  maintain  its  validity 
when  applied  to  a  new  sample.  The  results  obtained  by  this  procedure  well 
justified  the  decision  to  expend  the  necessary  effort.  The  amount  of 
shrinkage  is  often  surprising,  but  such  a  finding  leaves  one  with  the 
satisfaction  of  assurance  that  he  knows  the  worst.  The  experience  also 
leads  to  the  suspicion  that  many  a  prediction  composite  may  unwittingly 
rest  on  a  shaky  foundation.  Aside  from  cross  validation,  another  pro¬ 
cedure  that  can  be  used  is  to  test  regression  weights  for  statistical  signifi¬ 
cance  or  to  compare  weights  derived  in  two  halves  of  a  sample.  Having 
gone  this  far,  however,  the  cross  validation  entails  little  extra  effort. 
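
A cross validation of this kind might be sketched as follows for the simplest case of two predictors. Everything here is an assumption made for illustration: the simulated data, the sample sizes, the seed, and the function names. The weights are derived in one sample and applied unchanged to a second.

```python
import math
import random


def pearson(x, y):
    """Product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def regression_weights(x1, x2, y):
    """Least-squares weights for two predictors, up to a common positive
    scale factor (which does not affect any correlation)."""
    r1, r2, r12 = pearson(x1, y), pearson(x2, y), pearson(x1, x2)
    m1, m2 = sum(x1) / len(x1), sum(x2) / len(x2)
    s1 = math.sqrt(sum((a - m1) ** 2 for a in x1))
    s2 = math.sqrt(sum((b - m2) ** 2 for b in x2))
    return (r1 - r2 * r12) / s1, (r2 - r1 * r12) / s2


def composite(x1, x2, weights):
    return [weights[0] * a + weights[1] * b for a, b in zip(x1, x2)]


def cross_validate(derivation, holdout):
    """Return (validity in the derivation sample, validity in the new sample)."""
    weights = regression_weights(*derivation)
    r_derivation = pearson(composite(derivation[0], derivation[1], weights), derivation[2])
    r_holdout = pearson(composite(holdout[0], holdout[1], weights), holdout[2])
    return r_derivation, r_holdout


def simulate_sample(n, rng):
    """Hypothetical data: two weak predictors of a noisy criterion."""
    x1 = [rng.gauss(0, 1) for _ in range(n)]
    x2 = [rng.gauss(0, 1) for _ in range(n)]
    y = [0.3 * a + 0.2 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]
    return x1, x2, y


rng = random.Random(1947)
derivation = simulate_sample(60, rng)
holdout = simulate_sample(60, rng)
r_derivation, r_holdout = cross_validate(derivation, holdout)
```

The derivation-sample figure is the multiple correlation and is the largest value any weighting can give in that sample; the second figure is the empirical check, and the difference between the two is the shrinkage.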

SOME  GENERAL  IMPLICATIONS 

The  preceding  pages  have  presented  a  few  of  the  many  suggestions 
that  emerge  from  experiences  encountered  in  developing  tests.  They 
tend toward the more technical and statistical type of problem of limited
scope.  What  can  be  said  concerning  the  larger  vistas  of  human  mental 
measurement  that  surely  must  have  been  glimpsed  from  time  to  time? 
There  have  been  moments  when  the  vision  has  been  less  myopic  than  the 
detailed  accounts  of  this  volume  suggest.  It  is  hoped  that  the  mere  recital 
of  facts  about  test  after  test,  area  after  area,  has  also  provided  the  reader 
with  opportunities  to  share  in  the  outlook  that  first-hand  experiences 
have  offered. 

Those  who  have  been  close  to  practical  problems  of  vocational  selection 
or vocational guidance well know that the development of useful
techniques has been painfully slow and disheartening, and that the final
limitations of effective prediction may have seemed to be very great.
The ceiling of maximal accuracy of prediction may have seemed to be
quite low. If this is so, it is believed that the measure of success achieved
by  the  program  which  this  and  similar  volumes  represent,  should  ma¬ 
terially  alter  the  outlook  for  vocational-adjustment  service  for  the  better. 
There  were  probably  very  few  at  the  beginning  of  the  program  who 
would have placed much hope in the prospect of selecting pilot trainees by
means  of  printed  tests  alone  with  a  degree  of  accuracy  represented  by  a 
correlation  as  high  as  0.50.  That  much  accuracy  has  actually  been  accom¬ 
plished  by  means  of  a  battery  including  a  total  of  only  150  items,  as 
will  be  told  in  more  detail  in  another  Report.  Nor  would  one  have  ex¬ 
pected  that  by  means  of  a  longer  battery  which  includes  apparatus  tests 
a validity between 0.60 and 0.70 could be attained for pilot selection,
and  an  even  higher  validity  for  navigator  selection.  What  has  been  done 
for  these  two  occupations  can  be  done  for  others,  and,  as  may  be  seen 
in the latter part of chapter 28, the ceilings for pilot and navigator have
not  by  any  means  been  reached.  The  whole  area  of  temperament  is  very 
much  still  open  territory  in  this  connection. 

This  degree  of  success  has  not  been  accomplished  in  four  ordinary 
years,  or  with  ordinary  allotments  of  personnel,  or  with  ordinary  facilities 
and  subjects.  Only  the  crisis  of  a  world  war,  unfortunately,  could  permit 
it  to  come  to  pass.  Advances  in  other  vocational  areas  will  need  some¬ 
what  similar  concentration  of  efforts.  Fortunately,  much  that  has  been 
learned in the Army Air Forces will readily apply elsewhere. Much of
it, to be sure, will not apply without further research; but the groundwork
has  been  laid.  Much  more  needs  to  be  done  in  fundamental,  not  immedi¬ 
ately  practical,  research.  The  great  richness  of  human  talent  and  tem¬ 
perament  has  been  emphasized  as  never  before.  The  limitations  of  the  IQ 
and  the  PQ  have  been  thrown  into  bold  relief.  A  society  that  wants  a 
useful  and  dependable  vocational  assignment  of  its  personnel  must  be 
ready  to  support  the  research  that  is  required  to  satisfy  that  desire. 

The  volume  should  not  be  closed  without  pointing  out  the  fact  that 
the  program  was  a  highly  cooperative  affair.  It  may  be  regarded  as  an 
example  of  cooperative  research  and  of  what  can  be  accomplished  when 
trained  individuals  with  common  purposes  and  efforts  tackle  technical 
problems  in  the  democratic  way.  There  is  probably  no  single  contribution 
of  which  it  strictly  can  be  said  that  it  is  exclusively  the  brainchild  of 
Captain  X  or  it  is  the  test  constructed  by  Sergeant  Q.  Ideas  were  sub¬ 
mitted  to  group  discussion;  test  items  to  critics;  plans  were  laid  in 
conference; decisions were reached by agreement; specialized technicians
each  had  a  hand  in  the  finished  product.  Things  did  not  always  progress 
as  smoothly  as  this  account  may  imply,  but  the  essential  idea  was  develop¬ 
ment  of  tests  and  projects  as  socialized  ventures  and  the  constant  inter¬ 
play  of  criticism  and  rebuttal.  In  this  manner  many  a  potential  mistake 
was  undoubtedly  caught  early,  and  the  plan  was  the  richer  because  of 
the  multiple  contribution.  Because  of  the  high  proportion  of  creative 
work in research as such, and because psychological research, in
particular, profits relatively more by a pooling of individual impressions, the
cooperative  approach  has  much  to  offer. 

Having  made  these  remarks  about  cooperation  in  general,  it  should 
also  be  said  that  all  credit  and  the  gratitude  of  those  who  have  benefited 
or  will  benefit  by  the  results  of  their  efforts  are  due  to  the  many  indi¬ 
viduals  who  loyally  and  unsparingly  devoted  themselves  to  one  of  the 
greatest adventures in human engineering.

APPENDIX A


Mathematical  Rationale  for 
Chapter  3 


In  the  derivations  to  follow,  a  number  of  simplifying  assumptions  are 
made which the reader may consider inappropriate. It is true that a
condition such as equal item variances will never be met in practice. The
chief  usefulness  of  these  formulas,  however,  is  to  give  the  test  con¬ 
structor  a  rapid  way  to  predict  statistics  on  some  level  other  than  the  one 
on  which  he  is  working.  The  error  introduced  is  small  compared  with 
the computational convenience of the simplified formulas. Compare, in
this respect, the Kuder-Richardson formulas 20 and 21.

DEFINITION  OF  SYMBOLS 

$x_i$ = a deviation score in item $i$. It may take on various meanings:
$x_1, x_2, x_3, \ldots, x_n$, respectively.

$n$ = the number of items in the test.

$\sigma_i$ = the standard deviation for item $i$. $\sigma_i = \sqrt{p_i q_i}$.

$p_i$ = the proportion of the individuals who answer item $i$ correctly.
Correct answers are given a score of +1 and wrong answers or
omissions a score of zero.

$q_i = 1 - p_i$.

$\sigma_t$ = the standard deviation of the total test score.

$r_{ij}$ = a product-moment correlation (phi coefficient) between any two
items, $i$ and $j$.

$r_{it}$ = a product-moment correlation (point biserial) between item $i$ and
the total test score.

$r_{tt}$ = a reliability (internal-consistency) coefficient for the total test
score.

$r_{tc}$ = the correlation between the total test score and an outside criterion.

$r_{ic}$ = a point-biserial correlation between item $i$ and an outside criterion.

$\bar{r}$ = an average (mean) of a number of coefficients of correlation.

DERIVATION  OF  EQUATIONS 


Part  I. — Item  Intercorrelations 


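The correlation between an item and the total score, including the item,
is given by the equation

(1)  $r_{it} = r_{i(x_1 + x_2 + \cdots + x_i + \cdots + x_n)} = \dfrac{\sum x_i x_1 + \sum x_i x_2 + \cdots + \sum x_i^2 + \cdots + \sum x_i x_n}{N \sigma_i \sigma_t}$

Dividing through by $N\sigma_i$,

(2)  $r_{it} = \dfrac{\sigma_i + r_{i1}\sigma_1 + r_{i2}\sigma_2 + \cdots + r_{in}\sigma_n}{\sigma_t}$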

Assuming equal variances, i.e., constant difficulty for all items,

(3)  $r_{it} = \dfrac{\sigma_i + \sigma_i \sum_{j=1}^{n-1} r_{ij}}{\sigma_t} = \dfrac{\sqrt{p_i q_i} + \sqrt{p_i q_i} \sum_{j=1}^{n-1} r_{ij}}{\sigma_t}$

Rearranging and dividing through by $(n - 1)$ we have the mean
correlation between item $i$ and all the remaining items in terms of the
correlation between item $i$ and the total, the standard deviation of the sum,
and the mean standard deviation of the items:

(4)  $\dfrac{\sum_{j=1}^{n-1} r_{ij}}{n - 1} = \dfrac{r_{it}\sigma_t - \sqrt{p_i q_i}}{(n - 1)\sqrt{p_i q_i}}$

Summating for all items and dividing by the number of items,

(5)  $\dfrac{\sum_{i=1}^{n}\sum_{j=1}^{n-1} r_{ij}}{n(n - 1)} = \dfrac{\sigma_t \sum_{i=1}^{n} r_{it} - n\sqrt{p_i q_i}}{n(n - 1)\sqrt{p_i q_i}} = \dfrac{\bar{r}_{it}\sigma_t - \sqrt{p_i q_i}}{(n - 1)\sqrt{p_i q_i}}$

Formula (5) gives us the relationship between the mean product-moment
correlation between items (phi coefficient) and the mean product-moment
correlation between item and total score including the item (point
biserial), the standard deviation of the total score, and the mean item
standard deviation.

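The variance of a sum is given by the equation:

(6)  $\sigma_t^2 = \sigma_1^2 + \sigma_2^2 + \cdots + \sigma_n^2 + 2r_{12}\sigma_1\sigma_2 + \cdots + 2r_{(n-1)n}\sigma_{n-1}\sigma_n$

Assuming equal item variances, or constant item difficulty,

(7)  $\sigma_t^2 = n\sigma_i^2 + \sigma_i^2 \sum_{i=1}^{n}\sum_{j=1}^{n-1} r_{ij} = n p_i q_i + p_i q_i \sum_{i=1}^{n}\sum_{j=1}^{n-1} r_{ij}$

Substituting an expression for $\sum_{i=1}^{n}\sum_{j=1}^{n-1} r_{ij}$ derived from (5), we have

(8)  $\sigma_t^2 = n p_i q_i + p_i q_i \left[\dfrac{\bar{r}_{it}\sigma_t - \sqrt{p_i q_i}}{(n - 1)\sqrt{p_i q_i}}\right](n - 1)n$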


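Canceling and assembling terms,

(9)  $\sigma_t^2 = n\sqrt{p_i q_i}\,\sigma_t\,\bar{r}_{it}$

Dividing both sides by $\sigma_t$,

(10)  $\sigma_t = n\sqrt{p_i q_i}\,\bar{r}_{it}$

Substituting the expression in (10) for $\sigma_t$ in equation (5) and
simplifying, we obtain the desired relationship between the mean interitem
correlation (phi coefficient) and the mean correlation between items and
total score (point biserial):

(11)  $\bar{r}_{ij} = \dfrac{n\sqrt{p_i q_i}\,\bar{r}_{it}^{\,2} - \sqrt{p_i q_i}}{(n - 1)\sqrt{p_i q_i}} = \dfrac{n\bar{r}_{it}^{\,2} - 1}{n - 1}$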


Part II. — Internal Consistency of Total Scores

Substituting the expression in (10) for $\sigma_t$ in the Kuder-Richardson
formula number 21,

(12)  $r_{tt} = \dfrac{n}{n - 1} \cdot \dfrac{n^2 p_i q_i \bar{r}_{it}^{\,2} - n p_i q_i}{n^2 p_i q_i \bar{r}_{it}^{\,2}}$

Canceling and multiplying,

(13)  $r_{tt} = \dfrac{n}{n - 1} - \dfrac{1}{\bar{r}_{it}^{\,2}(n - 1)}$

The internal-consistency coefficient of the test as a whole is thus stated
in terms of the mean product-moment correlation between items and
total score (point biserial).
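
As a numerical check on Parts I and II, the two closing results (mean interitem correlation and internal consistency, each stated in terms of the mean item-total correlation) can be compared with two standard formulas that are not part of this appendix: the item-total correlation for items with equal variances and a common intercorrelation, and the Spearman-Brown formula. The function names and the values n = 30 and r = 0.10 are hypothetical.

```python
import math


def item_total_r(n, interitem_r):
    """Item-total correlation (item included) for n equal-variance items
    that all share the same inter-item correlation."""
    return (1 + (n - 1) * interitem_r) / math.sqrt(n + n * (n - 1) * interitem_r)


def mean_interitem_from_item_total(n, mean_item_total_r):
    """Part I result: (n * r_it^2 - 1) / (n - 1)."""
    return (n * mean_item_total_r ** 2 - 1) / (n - 1)


def reliability_from_item_total(n, mean_item_total_r):
    """Part II result: n / (n - 1) - 1 / (r_it^2 * (n - 1))."""
    return n / (n - 1) - 1 / (mean_item_total_r ** 2 * (n - 1))


def spearman_brown(n, interitem_r):
    """Reliability of a sum of n parallel items (used only as a check)."""
    return n * interitem_r / (1 + (n - 1) * interitem_r)


n_items, r_ij = 30, 0.10
r_it = item_total_r(n_items, r_ij)
recovered_r_ij = mean_interitem_from_item_total(n_items, r_it)
r_tt = reliability_from_item_total(n_items, r_it)
```

Under the equal-variance assumption the Part I expression returns the intercorrelation that was put in, and the Part II expression agrees with the Spearman-Brown value.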

Part III.—Spurious Item-Test Correlation

If $r_{tt}$ equals zero, then

(14)
$$\frac{n}{n-1} = \frac{1}{\bar r_{it}^{\,2}\,(n-1)}$$

Rearranging, and canceling,

(15)
$$\bar r_{it}^{\,2} = \frac{1}{n}$$

Taking the square root,

(16)
$$\bar r_{it} = \frac{1}{\sqrt{n}}$$

In other words, the mean product-moment correlation between item and total score including the item (point biserial) when inter-item correlations are zero is inversely proportional to the square root of the number of items in the test.
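The size of this spurious correlation can be seen in a small constructed case. The sketch below, in Python, is an illustration under stated assumptions: it builds a hypothetical group in which every possible answer pattern to three items occurs exactly once, so that the items are mutually uncorrelated, and then compares each item-total correlation with one over the square root of the number of items. The helper names are not from the original.

```python
from itertools import product
from math import sqrt

def mean(xs):
    return sum(xs) / len(xs)

def pearson(x, y):
    """Population product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    sx = sqrt(mean([(a - mx) ** 2 for a in x]))
    sy = sqrt(mean([(b - my) ** 2 for b in y]))
    return cov / (sx * sy)

n = 3
# Hypothetical examinees: each of the 2**n answer patterns appears once,
# which makes every pair of items exactly uncorrelated.
patterns = list(product([0, 1], repeat=n))
items = [[p[i] for p in patterns] for i in range(n)]
total = [sum(p) for p in patterns]

inter_item = [pearson(items[i], items[j])
              for i in range(n) for j in range(i + 1, n)]
item_total = [pearson(item, total) for item in items]

print([round(r, 4) for r in inter_item])   # all zero
print([round(r, 4) for r in item_total])   # each equals 1/sqrt(n)

# The same spurious value for a few test lengths.
spurious = {k: 1 / sqrt(k) for k in (4, 25, 100)}
print(spurious)
```

For a 100-item test the spurious component is only about .10, whereas for a 4-item test it is .50, so item-total correlations from short tests need the most caution.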


Part IV.—Validity of Items and of Total Score

The correlation between an outside criterion and the total test score, given in terms of item scores, reads:

(17)
$$r_{tc} = r_{(x_1 + x_2 + \cdots + x_n)c} = \frac{\sum x_1 c + \sum x_2 c + \cdots + \sum x_i c + \cdots + \sum x_n c}{N\sigma_t\sigma_c}$$

Dividing the last term through by $N\sigma_c$,

(18)
$$r_{tc} = \frac{r_{1c}\sigma_1 + r_{2c}\sigma_2 + \cdots + r_{ic}\sigma_i + \cdots + r_{nc}\sigma_n}{\sigma_t}$$

Assuming equal variances, or equal item difficulties,

(19)
$$r_{tc} = \frac{\sqrt{p_i q_i}\,\sum_{i=1}^{n} r_{ic}}{\sigma_t}$$

Substituting for $\sigma_t$ the expression in (10),

(20)
$$r_{tc} = \frac{\sqrt{p_i q_i}\;n\,\bar r_{ic}}{n\sqrt{p_i q_i}\;\bar r_{it}}$$

The validity of the total test score is thus stated in terms of the individual item validities and the item point biserials with total score. Canceling terms,

(21)
$$r_{tc} = \frac{\bar r_{ic}}{\bar r_{it}}$$

In other words, when items are of the same level of difficulty, or nearly so, the validity of the total score is the ratio of the average (mean) item validity to the average (mean) correlation between item and total score.1
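The ratio rule for total-score validity can be verified on a small constructed example. The Python sketch below is illustrative only: the item responses and the criterion scores are hypothetical, the items were chosen to have equal difficulty (p = .5), and the helper names are assumptions rather than anything defined in the report.

```python
from math import sqrt

def mean(xs):
    return sum(xs) / len(xs)

def pearson(x, y):
    """Population product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    sx = sqrt(mean([(a - mx) ** 2 for a in x]))
    sy = sqrt(mean([(b - my) ** 2 for b in y]))
    return cov / (sx * sy)

# Hypothetical responses: 3 items by 8 examinees, each item passed by half.
items = [
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 1, 1, 0, 0],
]
# Hypothetical outside criterion score for the same 8 examinees.
criterion = [9, 7, 8, 4, 6, 5, 2, 3]

total = [sum(col) for col in zip(*items)]

mean_item_validity = mean([pearson(item, criterion) for item in items])
mean_item_total = mean([pearson(item, total) for item in items])

validity_direct = pearson(total, criterion)            # computed from total scores
validity_ratio = mean_item_validity / mean_item_total  # mean item validity / mean item-total r

print(round(validity_direct, 6), round(validity_ratio, 6))
```

The two values coincide exactly here because the item variances are equal; with items of differing difficulty the ratio would serve only as an approximation.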


1 This appendix was written by Capt. Lloyd G. Humphreys.
APPENDIX B


Factor Loadings, Communalities, Reliabilities, and Validities for Printed Tests Grouped Alphabetically

Appendix B — Factor loadings, communalities, reliabilities,


Test and Code No.


Aerial  Photograph*, 

QP90IA-IV . 

biographical  Data, 

C 1X021) . 

(Navigator) 
biographical  Data, 
Chr>02D  (Pilot) . 

block  Counting, 

CP5!2A . 

Code  Analysis, 

CI053AX2 . 

Combat  Planea, 

CIC55AX5 . 

Competitive  Planning, 

CM09AX2 . 

Complex  Beale  Read¬ 
ing.  CE454A . 

(Total  Right*) 
Complex  Beale  Read¬ 
ing.  CE454A . 

(Total  Wrong*) 

Cubes.  CP512A . 

Decoding.  C1214AX1 
Decoding,  C1214AX2. . 

Dial  Reading.  CP622A. 

Dial  A  Table  Reading, 
CP821A.  01*822 A. . . 


Directional  Orienta¬ 
tion.  CP415B . 

Dirrctionrl  Plotting. 

CE435A . 

(Tout  Right*) 
Directional  Plotting, 

CK455A . 

(Total  Wrongs) 
Driving  Skill, 

CI307AX1 . 

Figure  Analogies, 
C1212AX1 . 

Figure  Clarification, 

CI213AXI . 

Fls-s.  Figure*,  Card*. 

CP312A . 

Fli-bt  Formations, 

C165IAX3 . 

Following  Direction*, 

CIM02A . 

Following  Oral  Direo- 
tions.  Cltl'il AX3 _ 

Forced  landing*, 

CIO  52AX4 .  ... 

Cemml  Information, 

CK5n.il) . 

(Navigator) 

Ceorrol  information, 
CK'sVit)  il'ilnt)  .  ... 
Ceiural  Information 
( I'ech.-Vwab.- 
ltomb  ),  CE.VOJC. . . . 


General  Information, 
(Tr»  h.*Vocab,Nav.), 
CK505C . 


(general  Information 
!Tcr!i.-Vocab.-l*»l ), 
CtoOjC . 


•.  VJa.  '  - 


Nr 

C a 

Ii 

I* 

h 

j 

LE 

MB 

ME 

M, 

PM 

VM 

Mi 

392 

{ 

06 

IS 

3000 

42 

-04 

3000 

29 

50 

302 

04 

38 

200 

03 

40 

42 

00 

-07 

268 

37 

12 

28 

10 

-03 

372 

33 

38 

-02 

334 

03 

334 

37 

638 

06 

00 

03 

-13 

-10 

-03 

202 

00 

1900 

-06 

-17 

392 

08 

8000 

.... 

IS 

-02 

392 

08 

36 

334 

-03 

334 

41 

202 

21 

46 

408 

08 

04 

34 

11 

-05 

202 

38 

392 

00 

15 

2G4 

46 

14 

21 

13 

-02 

208 

11 

34 

10 

05 

-03 

208 

18 

23 

18 

09 

07 

208 

te 

38 

n 

-03 

03 

3000 

27 

02 

3000 

00 

30 

3000 

15 

3434 

03 

03 

01 

oo 

14 

10 

3COO 

39 

.  . 

N 

P 

PI 

27 

39 

•  •  a  # 

10 

10 

-14 

-20 

14 

-09 

-13 

43 

a  •  a  . 

29 

08 

.... 

22 

02 

.... 

13 

03 

•  a  #  » 

52 

00 

18 

31 

12 

36 

•  e  •  . 

.... 

31 

.... 

50 

27 

.... 

33 

31 

11 

22 

07 

44 

08 

17 

08 

.... 

20 

17 

.  a  •  . 

05 

01 

•  •  •  * 

-03 

31 

09 

11 

25 

09 

21 

08 

-02 

07 

20 

10 

20 

-10 

23 

38 

04 

08 

33 

14 

10 

00 

-08 

17 

34 

Appendix  B 


Tfrt  ind  Ode  No. 


Ni  c«  I*  !• 


Gecgrophy.  • .  *9®° 

Oottsehaldt  Figure*. 

qpooia-iu . 

iiiitory,  A8153 .  1900 

Instrument  Ompre- 
hension  1,  — I015A. .  468 

Instrument  Compre- 
hension  II,  CtflioB. .  468 

'  Judgment  of  Prcpor- 

lion.,  CP200B .  562 

Log  Hook  Aeeuraey . . .  -J® 

“asr*-. . «» 

M8iS;?S7: .  m 

"ssbe- . 

"a  ssssr. . «• 

“?,»•  m 

Map  I’Utvning. 


LB  [MB  ME 

777TI  oi  12 


M K  PM  Vm!-V!,I  N  PIPI 


l  i  1-03 


:::  » -os .... 

08  -08  . . 


-07  21  08....  02. 

11  -02  ....  13 

. -09  06  IS 

06  06  00....  02 

91  ....  -09  ....  30 


22  18  ... . 
17  13  ... . 

...  08.... 

10  18.... 


13  03  03  . 1  09  17.... 

IS....  02  ...  13....  02  22.... 

02  . .  —06  . .  32  13  •  •  •  • 


04....  09  01... 


09  C4  . . . . 

41  S2  05 
14  88  07 


M‘A7SS-  m  .  11  “ .  18  - 

"8». .  a . u-u-s  “-o'-::  ::::::::  8  8::: 

Marking  Accuracy. . . .  206  ....—07  01  07  - 

Mathematica  A.  .-04  .  42  07  -1 

C1702B .  3000  . ! . . . 


Mathematica  A. 

CI702P.... .  3000  1 

Me'hematiee  B, 

C1206B .  «20 


Mathematica  B.  <*—«a  . .  ...  m  m  —01 

CI206C . .  3266  13  04  00  16  ...  •  «* 

Mathematica  B.  .  nr  04  04 

C1206B,  CI700A. . . .  3960  .  02  04  04 


371  071. ..  - 1 -  51  <» 

...  -04  091  161  14....  00 


..-01  12  lrf....  <»-01 

m  na  08  ...  id...  67-01 


Mechanical  Functiona, 

CI907A. . . . . .  153 

Meehanieal  Informal- 
tioo,  CI60SA .  3791 


.  05  05 

01  -09  -03  00  071 


74  ....  ...1...  -08  00 


Mechanical  Move-  j 
meats,  CI0O4A.  ....I  153 
Mechanical  Piindplea.  I 
CI003A . . .  7385 


-04....  38 


Mechanical  Principle#,  , 

CI003B . . .  354  17| 

Memory  for  Land- 

marks.  C1510AX1. . .  417 

Memory  for  Plana- 

Names.  C1606AX1. .  238 

Memory  for  Plane  Kl- 
hnuettes,  CIS03AX1.  417 
Memory  for  Ships. 

CI504AX1 .........  238 

Memory  for  Tactical 
Plana.  C1509AX ....  179 

Nearest  Point-Point 

Uiatance,  CP607B. . .  545 

Number  Series  Com- 

plelion.  C12I5AX1.  202 

Numerical  Operations, 

C1701U  (Back) .  6266 


43 

20  ... . 

.... 

....  07  08  04  .... 

-03 

-05 

04  011  021—09  . . . . 

11 

-06 

1 

....  37. 

..  02.... 

-07  08 

..  02.... 

-13  01 

61  20  44 

....  16 

58  02  61 

....  29 

56  06  -09 

....  34 

50  06  20 

....  29 

10  - 12  ... . 

....  -02 

...  23.... 

-04  09 

47  05.... 

81  04  OS 

78  10  02 


Numerical  Operation*, 
C1701B  (Total) . 


02  14 


86  08.. 


Twi  and  Cod*  No, 


Organisational  Plan¬ 
ning.  CI407AX1 .... 
Organiutlonal  i 'Ina¬ 
rm*.  CI4C7BX1. ... 
Pato  L*n*th.  CT428B. 
PatUrn  Analyiis, 

CP612A.  . . . 

PatUra  Assembly. 

CPbOtA . 

PatUra  Coeopicbeo- 

»ioa,  CPMMA . 

PatUra  Cotnerchoa- 
•ior.  CP803AX1 . . . . 

Phytic*.  ClhOl A . 

Phytietl  Priodploa. 
ClbOlUX . 

Picture  InUjratioo, 

CPiOtA . 

Planning  Air  Mama 
vers,  CltOSAXi. . . . 
Pianola*  Air  Manta- 
nn,CI40HAXl. . . . 
Planning  A  Cireait, 

C1401A . 

Pianola*  A  CoafM, 

CI4MAXS . 

Plot. in*  Accuracy, 

CE4i3A...rT . 

(Total  RighU) 

Plot  tin*  Amina, 

CE4AIA . 

(Total  Wrong*) 

Pto4  tin*  Test.  CKtm. 

(Total  RichU) 

Plot  lias  Test.  CEAAJA 
(Total  Wrongs) 
Practical  Fatimatiens 

I.  CI30MAXI . 

Practical  Fatirar-tioaa 

II.  CI308AX1 . 

Practical  lodgment 

(Mechanical  Itotaa), 

C1301BXI . 

Practical  Judgment 
(Non-Mechanical). 

C1301BXI . . 

Practical  Judgment  I 
(Noo-Mcchaai*aO, 

C130IBX3 . 

Practical  Judgment  11 
(Work  Ptaa), 

CU0IBX3 . 

Pwrawt-Path  Tracing. 
CPA  12  A 


Resdiaj  Csca  peahen 
non.  C  It  I4Q . 


lUadjnf  Comrrshon 

non.  Cltltll . 

Rsuts  Ptanaiau. 

C141IAXI . 

Requeues  at  Maaoa 

▼era.  CUIOA . 

Shorter  liae-lJa* 
Length*.  CPGOgfl  . 


Shorter  Path-Path 
lXjtaaes,  CPOOaB 


Rif  cal  Interpretation. 


CIO.VJA 

Bpaoal  Urn  a  U  lion  L 

CPuOIA 


pstial  Orientation 
CPWIB . 


pat  la)  CVUaUtioa  It 
CPA03B . 


MR 


IRS 


7S03 


£  8  S  2  §13  3  3  S  3  3  E  S  E  EE3SSSI  KUIBI 


TmI  and  Cod*  No.  Ni  I  Co 


Spatial  Rraeoniof, 

C13U0X1 .  404 

Spatial  VUuali  ration 

T.CI204AX1 .  203 

8patial  Viiualiiatloa 

I.C1204AX3 .  20fl 

Spatial  Viaualliatloa 

it,  CI203AXI .  203 

Speed  o 4  Identification 
(Non-Rota  lad). 

CP010A . 

8 pead  of  Identification  I 
(Rotated),  CP01OA.  .1  7049 


Tabla  Reading. 
CP03IA . 


Tool  Function, 

C100AA .  .IM 

Vocabulary,  CI004B. . .  1900 


1 Decimal points have been omitted.

2 Validities and other statistics are weighted averages. See chapter 38 for explanation.
3 Derived from combinations of data on similar forms.


Standard Deviations of 65
Selected Tests in Samples of
Unclassified Aviation Students
at Sheppard Field, Texas1
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A 

Ability  to  Listen  in  Noise,  CP/LiBX? 
4-V438 

Activities,  active  vs  sedentary,  348 
Adaptability Rating for Military Aeronautics (ARMA), 605, 667 f.

Aerial  Landmarks: 

CP525A, 535 f.

CP525B, C, 536
Aerial  Orientation: 

CP520A,  486-488 
CP520B,  C,  488 

Aerial Photographs, QP901A-IV, 411
Aggression, ratings, 666 f.
Aiming  Stress  Test,  CE211A,  804 
Aircrew Preference Rating Scale,
CE503A, B, 732 f.

Airplane Formation Memory, CI513A,
243 f.

Ambiguous  Figures,  CI316A,  141  f. 
Ambiguous  Ink  Blots,  CI317A,  140  f. 
Angle  Estimation,  CP218A,  469-471 
Angular Judgment, CP217A, 468 f.
Angular-judgment tests, 468-472
evaluation, 469, 471, 472, 474 f.
validity,  460 
Appearance: 

and  interview  results,  668 
and  pilot  training,  668 
Aptitude and interest, 728 f., 730 f.

Area  Visualization,  CP815A,  423  f. 
Arithmetic  Problem  Solving,  AI215A 
556-558 

Arithmetic  Reasoning: 

CI206B, 92-94
CI206C, 90-92

Arithmetic-reasoning tests, 111
evaluation, 94 f., 122
validity,  92-94 
Arithmetic  Speed  Test,  558 
Armament trainees, 65, 312, 365, 452, 484,
485, 784 f.

Armorer, 78, 83, 92, 303, 326, 352, 355,
357, 378, 402, 528, 529 f., 531, 534
Ascendance-submission,  738 
Attention  in  aircrew,  541  f. 

Attention Test, CP403A, 549-551


Index 

Attention  tests,  evaluation,  860  f. 
Attitude  tests,  736-765 

evaluation,  757,  759,  761,  763,  765  f. 
validity, 757, 759, 760 f., 762 f., 764 f.
Atypical  Behavior,  CE708A,  656  f. 
Aviation Cadet Preference Scale,
CE509A, 733

Aviation  Cadet  Training  Preference 
Blank: 

CE501A,  732 
CE501B, 732
CE501E,  723-732 

Aviation  Preference  Check  List,  746-749 
Avocations,  validity  of,  346 

B 

Behavior  Preference  Questionnaire, 
CE432A,  711  f. 

Bernreuter Personality Inventory, 572,
767,  CE433A,  588  f. 

Bias,  falsification  in  scores,  778  f. 
Biographical: 

data,  and  pilot  aptitude,  780  f. 
method, 767-795
theory,  768 

Biographical Data Blank:

CE602A, 595, 770-773
CE602B, 773-775
CE602B-SA, 775 f.

CE602SAB, 776 f.
CE602D,  777-787 
CE602E,  787-790 
CE602F,  791 
CE602FNV, 790
CE602W, 786 f.
Biographical-data tests, 573 f.

evaluation,  773,  775.  776,  777,  TEA, 
790,  794  f,  866 

validity, 772 f., 774 f., 776, 777, 781 f.,
785 f., 788-790, 793
Biserial r, 30 f., 34 f.

Bombardier: 

biographical  background,  769 
criterion,  17 
D-8,  86 

intelligence  and,  47 
job requirements, 3-5, 73, 299 f.
C

Camouflaged Figures, CP810A, 434 f.
Camouflaged Outlines, CP821A, 430 f.
Camouflaged  Words,  CI323A,  139  f. 
Carefulness : 

factor, (See Factor)
tests, 680-686
evaluation, 695, 719, 864
recommended,  868 
Case  summaries,  validity,  584 
Change-of-set  tests,  556-563 
evaluation,  563,  564,  860 
validity,  563 

Check List, CE506X, 346 ff.
Circular  error,  17,  76 
Civil  Aeronautics  Administration,  767 
Classification: 
battery, 867 ff.
optimal,  875  f. 
problems,  875  f. 

Clerical-speed  tests,  384-407 

evaluation, 386, 388, 392, 393, 394,
399, 403, 404, 405, 406 f.
validity, 386 f., 388 f., 390 f., 394,
399-402
Clinical: 

impressions, validity, 631 f.
predictions,  intercorrelations,  588, 
64’.  655,  659,  664 
procedures, 623-671
evaluation, 669-671, 863
project, 623 f.
studies, 3

Clock Reading Test, CP527A, 562 f.

Code Analysis, CI653AXJ, 213-215
Code Deciphering Test, 558 f.
College  requirement,  A.  A.  F.,  46 
Combat  Planes,  CI655AX5,  206-210 
Combat  readiness,  757-759 
Communality, factorial, 822
Compass Directions, CP524A, 525 f.
Compass Orientation, CI660A, 523 f.
Compass-orientation tests, 512-527

evaluation,  514,  516,  517,  520,  523, 
524,  526,  539 

validity, 514, 515, 520, 522, 524, 526
Competitive  Planning,  CI409AX2,  177- 

180 

Complex Concentration, CI6SSAX1, 210-
212

Complex Coordination Test, CM701A,
122, F2, 215, 225, 477 f.

Complex Scale Reading, CE454A, 633 f.


Comprehension:
defined, 47
verbal, (See Factor)

Computation  tests,  80-87 
evaluation,  84,  86,  87,  853 
validity,  82,  83,  86 
Computer,  E-6B,  72 

Conduct  of  the  War  Test,  CE520A,  761- 
763 

Conference for Interpretation, CE707A,
652-656
Confidence: 

measures,  703-713 
evaluation, 704 f., 707 f., 710,
712

validity, 704 f., 710 f., 711
score,  551 

tests,  evaluation,  864  f. 

Constancy,  size,  466 

Control Confusion Test, CE214A, 663-

665

Cooperation,  ratings,  666  f. 

Coordination, psychomotor, (See Factor)
Correction: 

for  guessing,  2<X  33  f. 
for restriction of range, 36 f.
Correlation : 

analysis, 18-20, 22 f.
matrix 

carefulness  battery,  687 
December  1942  battery,  799 
foresight and planning I, 181
foresight  and  planning  II,  183 
integration  battery,  216  f. 
judgment  and  reasoning,  148 
July  1943  battery,  801 
mechanical  battery,  334 
memory  battery  I,  262 
memory  battery  II,  262 
November  1943  battery,  902 
perceptual  battery  I,  409 
perceptual  battery  II,  410 
reasoning,  114 

September  1944  battery,  S0J,  904 
Sheppard Field battery, 901 ff.
Crawford  Bennett  Point  Motion  Test, 
285-287 
Criteria: 

bombardier,  17 
navigator,  16  f. 
pilot.  16 

for temperament tests, 620
alidatior^  38  f,  884 
D

Decision difficulty, 616 ff.

Decoding: 

CI214AX1, 103 f.

CI214AX2,  26,  100-103 
Decoding  tests: 

evaluation,  103,  104 
validity,  102 

Deduction,  (See  Reasoning) 

Dial  Reading,  CP622A,  393  ff. 

Dial  and  Table  Reading,  CP622A, 
CP621A,  395-403 

Difficulty, and factor composition, 810
Directional-discrimination tests, 479-498
evaluation,  483,  486,  488,  498 
validity,  483-485,  488  f. 

Directional  Marking,  CP533A,  497  f. 
Directional  Orientation: 

CP515A, 515 f.

CP515B, C, 512-514
CP515D, E, F, 516 f.
Directional Plotting, CE455A, 681 f.
Discrimination  Reaction  Time: 

CP611D,  described,  801 

diagram,  68 

CP634A.  (Paper),  493-496 
Distance  Estimation,  CP212A,  466-468 
Distance-judgment  tests,  447-474 

evaluation,  449,  451,  452,  454,  4% 
461,  46*  465  f ,  46S,  475,  859 
validity.  449  f ,  452  f .  454  f ,  458, 
46*  465 

Distraction  tests,  549-556 

evaluation,  551,  553,  55',  555  f ,  564 
validity,  SSI,  551,  555 
Dominance, ratings, 666 f.

Driving Skill, CI307AX1, 3_7-3«)

E 

Fcifnomteal-proccdu- es  u  i<,  161  ' " ' 
Ego  sensitivity,  616  (i. 

Eliminees:

interviews with, 9 f.

kinds  of,  36 

EnMional-'-t.jlsility  ratings,  (VV)  f 

Empaflietie  U<'|'Ou\c  Tc  t  i.7 ISA,  6'6- 

619 

Empathy,  defined,  646 
Estimation of Length:

(  I'ftJI  A,  26,  :-\v  4(6 
illusions  in.  438 


F 

Face  validity,  230  f. 
defined,  2! 

Factor : 

analysis,  18-20,  22  f.,  39-42, 
in  the  AAF.,  477  f. 
assumptions,  39 
advantages,  797,  489  f. 
basic  equations,  839,  877 
carefulness  tests,  686-695 
centroid  extractions,  41 
December  1942  battery,  798-8! " 
diagonal  entries,  40 
integration  tests,  215-225 
and  job  analysis,  844  f. 
judgment  tests,  142-154 
July  1943  battery,  798-819 
mechanical  tests,  321,  333-339 
memory  tests,  261-268 
November  1943  battery,  800- sis' 
perceptual  tests,  408-418 
of  pilot  criterion,  839-845 
planning  tests,  180-190 
procedures, 39 f.
reasoning tests, 113-122
rotation of axes, 41
September  1944  battery,  80?d 
of stanines, 819 f.
of training criteria, 876 f.
carefulness  691, 
defined,  823 

composites  validity,  843  « 
composition,  and  speed.,  «j0  f. 
general reasoning, (reasoning I),
119,  147,  151,  187,  223,  /6ft  3.V  f . 
41ft  691.  818 
defined,  838 
integration  !,  222, 
defined,  823 
integration  II,  22* 
defined,  823 

integration  III,  189,  224, 
defined,  823 
judgment,  152  f.,  189, 
defined,  823 
kinesthetic-motor,  818 
length  estimation,  337  f,  418, 
defined,  823 

mathematical  background,  818, 
denned,  823 

mechanical  experience,  151,  188,  221, 
336,  61  o,  3  i  6,  defined,  823 

Memory I, (See paired-associates

memory)




Memory  II,  (See  visual  memory) 
Memory III, 267
defined,  823 

numerical,  1x3.  186,  221,  416,  690,  81S 
paired-associates  memory.  (Memory 
I).  265, 
defined,  823 
pilot  interest,  817, 
defined,  838 
planning,  189, 
defined,  838 

perceptual  speed,  118,  186,  220,  264, 
336  !.,  415,  814, 
defined,  823 

psychomotor  coordination,  (psycho¬ 
motor  I),  222,  691,  816  f, 
defined,  838 

psychomotor  precision,  ( psychomo¬ 
tor  II),  692,  817, 
defined,  838 

psychomotor  speed,  (psychomotor 

III), 223,
defined.  838 

reasoning  I,  (See  general  reasoning) 
reasoning  II,  120,  151 
defined,  838 
reasoning  III,  120 
defined,  838 

as  reference  category,  797 
social-science  background,  819, 
defined,  838 

space  I,  (See  spatial  relations) 
space  II,  417, 
defined,  838 
space  III,  693, 
defined,  838 
spatial,  478  t.,  '09  f. 
spatial  relations,  (space  I),  119,  187, 
224.  266.  338,  417,  693  f..  815, 
defined,  838 

speed  and  strength  of  closure,  4-15 
summary  of  loadings,  821-838,  892- 

899 

test  batteries,  874  f. 
validities,  820  f., 
negative,  811  f. 

verbal.  120.  152,  18 8.  221,  265.  337. 
813  f, 
defined,  838 

visual  memory,  266,  416 
visualization,  119,  152,  183,  220,  2t>?, 
338,  415.  692,  815, 
defined,  838 

Faculty  board,  defined,  1 


Falsification in inventories, 778 f.

Fear  tests,  695-703, 

evaluation, 699, 702 f., 864
validity,  697 

Figure Analogies, CI212AX1, 104-106
Figure analogies tests, 104-106, 146,
evaluation,  106 
validity,  106 

Figure Classification, CI213AX1, 106-
109 

Figure  classification  tests: 
evaluation,  109 
validity,  108 

Figure Similarity Test, 560 f.

Finger Dexterity Test, CM116A, 804
Flanagan  r,  30  f. 

Flexibility  of  attention,  hypothesis,  556 
Flight Formations, CI654AX5, 196-199
Flight Orientation, CP528A, 488-491
Flight  Path,  CP105A,  292-294 
Fluency tests, 135-142,
evaluation,  142 

Flying evaluation board, defined, 1
Following  Directions,  CP102A,  347  549 
Following  Oral  Directions: 

CI651AX, 517-520
CI651BX, 521-523
CI651CX, 523

Forced Landings, CI652A, 202-206
Foresight,  defined,  47 
Foresight  and  planning: 
in pilot training, 157 f.
tests.  157-180 
evaluation,  855 

Foresight and Planning Maze Test,
CI405A, 165

Form-perception tests, 419-444,
evaluation,  859 

Formation Visualization, CP814A, 281 f.
Frustration, military service and, 737

G

General Information:

cT.miJO,  350 
i  i-  .x)5D,  359-3ol 
i’F5f>5K,  361-366 
C  !  5uSF.  366  f. 
r  I  r  .'FX2,  3o7  f. 

{  i-  s)5FXJ,  368 
t  i-iVGXl.  3Co 
CK5«VfiX2  367  f. 

,  i;;ov;x3,  676  f. 

<  KVisCXt.  t>71  (. 
CE505GX5, 6, 675
CE505GX7, 676
CE505GX8, 368

General-information tests, 350-368, 572 f.,
674-677 

evaluation,  358  f.,  360  f.,  366,  369, 
858 

validity,  352,  354  f.,  357  f.,  360,  3M  f„ 
366,  367,  368 
General  reasoning: 

factor,  (See  Factor) 
theory,  339 

Genotypic  descriptions,  14 
Geographical  Memory,  CI508AX,  257- 

259 

Geography,  AS104,  34,  763 
Geometric  illusions,  439  ff. 

German  military  psychology,  665 
Gestalt,  illusions,  443  ff. 

Goodenough  Speed  of  Association  Test, 
678 

Gottschaldt  Figures,  QP901A-III,  411, 

430 

Grade-slip  entries,  3 
Graph  Reading,  CP601B,  384-386 
Graphic  rating  scale,  evaluation,  723  f. 
Guilford-Martin  Personnel  Inventory, 
CE436A,  592-595 
score intercorrelations, 594
Gunner,  flexible,  60,  78,  83,  86,  308,  326, 
381,  402,  515 
Gunnery,  fixed,  75 

H 

Harrower-Erickson Rorschach, 634-637
Heterogeneous  tests,  380  f. 

Hidden  Figures,  CP802A,  430 
History,  AS153,  763 
Homogeneous  tests,  880  f. 

Home  Front  Attitude  Inventory, 
CE446A,  763-765 

Humm-Wadsworth Temperament Scale,
CE418A,  581-585 

I 

Illusions: 

geometric,  439 
Gestalt, 443
tests,  438-444 
evaluation,  441,  442  f. 


Identification  of  Velocities  II,  CP205B, 
408  f. 

Indices  of  Self  Confidence,  CE427A,  703- 
706 

Induction,  ( See  Reasoning) 

Information: 

in  judgment,  145 

tests,  48  f.,  (See  General-informa¬ 
tion  tests) 

Information  Blank  S-C,  CE410A,  578- 
580 

Instructions,  experiment  on,  778  f. 
Instructor-student,  matching,  619 
Instrument  Comprehension: 

I,  CI615B,  479-486 
II,  CI616B,  479-486 
Integration:

ability, in pilot, 191 f.

I,  II,  or  III,  (See  Factor) 
ratings,  666  f. 

validity  of,  584 
tests,  49,  191-215 

evaluation,  195,  199,  202,  206,  209  f., 
212,  215,  225,  855  f.,  860  f. 
recommended,  869 
requirements,  192 
validity,  195,  198,  201,  205,  212 
Intellect,  aircrew  requirements,  46 
Intellectual  tests,  45-49 
defined,  45 

Intelligence  quotient,  852,  885 
Interaction Test, CE425B, 665-667
Interest:

and aptitude, 728 f., 730 f.,
fighter  vs  bomber  pilot,  731  f. 
ratings,  722  ff. 

evaluation,  734-736 
validity, 725 f., 734
scores,  550  f. 
and  stanine  validity,  729 
tests,  608-616,  736-765, 
evaluation,  740,  744,  748,  752,  754, 
766,  862,  865  f. 

validity,  739  f.,  742  f.,  747  f.,  751  f. 
Internal  consistency,  28-34, 
equations  for,  889 
Interview: 

method,  652-656 
psychiatric,  605  f. 
validity,  655 

Introversion-extroversion, 589 ff., 738
Inventory  of  Attitudes,  CE518A,  759-761 
Inventory  of  Experiences,  Interests,  and 
Attitudes,  CE612AX2,  749-752 
Inventory of Factors GAMIN, CE435A,
595-599

score intercorrelations, 597
Inventory of Factors STDCR, CE434A,
589-592

score intercorrelations, 590
Item: 

analysis,  22, 
against  factors,  879  f. 
difficulty,  33  f. 

intercorrelations, equations for, 887 f.
selection, principles, 361
validity, 37 f.
and  aptitude,  780  f. 
equations,  889 
writing,  21  f. 

Item-test  correlations: 
comparison,  30  f. 
equations,  889 

J

Job  analysis: 

by  factor  analysis,  844  f. 
bombardier, 3-5
formal,  2 
navigator,  5-7 
pilot,  7-11 

and  test  construction,  13  f. 
Judgment: 

in  aviation,  123  f. 
commonsense,  143 
defined,  47 
factor,  (See  Factor) 
logical-reasoning,  144 
mechanical,  144 
tests,  123-154 

evaluation,  127,  123,  130,  154  f., 
854  f. 

recommended,  869 
validity,  127,  128,  153 
Judgment  of  Proportions,  CP206B,  472- 
474 

Judgment-of-proportions  tests,  472-474 
evaluation,  474,  475 
validity,  474 

K 

Kinesthetic-motor, factor, 818
Koerth pursuit test, 813
Kuder-Richardson  formulas,  28 
Kuder  Preference  Record,  CE515A,  613- 

616 

score intercorrelations, 615


L 

Landing Judgment, CP505B, 4?1 f.
Landings, psychological features, 158, 191
Leadership tests, 665 f., 713-719
evaluation, 715 f., 718 f., 865
Length estimation:

factor, (See Factor)
tests, recommended, 869
Line of Flight Test, CP102A, 292
Log Book Accuracy, 406 f.

Logical Sequence, CI217A, 98 f.

M 

Maller-Glaser  Interest  Values  Inventory, 
CE514A,  611-613 
score intercorrelations, 613
Map Distance, CP626B, 456-461
Map Memory:

CI505AX1, 233 f.

CI505AX2, 238 f.

CI505AX3, 236-238
CI505BX1, 235 f.

Map Planning, CI412AX, 164-167
Marking  accuracy,  404  f. 

Masculinity: 

in  air  crew,  673  f. 
pilot  validity,  599 
tests,  673-680 
evaluation,  677,  680,  863  f. 
Mathematical : 

background,  factor,  (See  Factor) 
tests,  recommended,  870 
tests,  71-88 

evaluation, 78, 87 ff., 853
validity, 75-78
Mathematics:

CI702C,  79 
CI702E,  73-78 
CI702F,  79 
CI702GX1, 79 f.

Mathematics A, (See Mathematics
CI702C-G)

Maze Coordination Test, CM1J8A, 554-

556 

Maze  Tracing  Speed  Test,  558 
McQuarrie  test,  382 

Mechanic,  air,  60,  65,  78,  83,  308,  31& 
326,  352,  355,  357,  365,  381,  402;  464, 
528,  529  f„  531,  5tt,  784  f. 

Mechanical: 

ability 

in  aircrew,  2.8-300 
causes,  300  f. 
measurement, 300
comprehension tests, 145, 301-322
evaluation, 309, 313, 315 f., 319,

322,  339 

validity,  306-308,  312,  315,  319 
experience:

factor, (See Factor)
tests, recommended, 870
information tests, 322-333
evaluation, 324-327, 330, 332, 333,
339

validity, 324-326, 330, 332
movements tests, 146
tests,  301-333 
evaluation,  857  f. 
validity,  153 
Mechanical  Functions: 

C1907AX,  313-319 
CI907B,  317 

Mechanical  Information : 

CI905A,  34,  323-327 
C905AX1,  326  f. 

CI905AX2,  327 
CI905B,  327 
C1905BX,  32? 

CI905BX1,  34 
Mechanical  Knowledge,  326 
Mechanical  Movements: 

CI904A,  320 
CI904AX,  319  f. 

CI904AX1,  319 
CI904AX2,  317-322 
Mechanical  Operations,  316 
Mechanical  Principles: 

CI903A,  302-310 
C1903B,  310-313 
Memory: 

abilities,  hierarchy,  267 
in  aviation,  227-229 
deficiencies,  228 
defined,  47 
research  plan,  229  f. 
score,  551 
tests,  227-261,  550 
evaluation,  234,  236,  237,  240,  241, 
243,  244,  246  f.,  249  f.,  252  f., 
254,  257,  259,  261,  856 
pictorial,  231-254 
recommended,  870 
symbolic,  254-261 
validity,  234,  235,  237,  240,  216, 
249,  252,  254,  257.  259,  261 
Memory  for  Landmarks,  CI501AX1,  248- 
251 


Memory I, II, or III, (See Factor)
Memory for Plane Designations,
CI507AX, 260 f.

Memory for Plane Silhouettes,
CI503AX1, 244-248
Memory  for  Ships,  CI504AX,  253  f. 
Memory  for  Tactical  Plans: 

CI509AX1,  267 
CI509BX,  255-257 
Mental  set,  factors,  225 
Meter  Reading,  CP602B,  386-388 
Minnesota Multiphasic Personality Inventory,
CE437A, 599-601
Minnesota  Personality  Scale,  CE438A. 
601-603 

score  intercorrelations,  603 
Minute  Difference  Discrimination  Test, 
561  f. 

Morale,  tests,  755-765 
Motivation : 

knowledge  and,  341 
measures,  723-765 
evaluation,  865  f. 
tests,  573  f. 

Mutilated  Words,  CP512A,  424  f. 


N


Navigator : 

biographical  background,  769 
criterion,  16  f. 

grades,  correlations,  60,  77,  83,  94, 
96,  102,  106,  108,  112,  276,  282,  306, 
326,  358,  380,  401,  484,  485,  530, 
534 

intelligence  and,  47 
job  requirements,  5-7,  72 
Navy,  U.  S.,  342,  733,  746,  767 
Nearest  Point,  CP607,  451  f. 
Neuroticism,  738 
Normality  of  Perception: 

CP806BX3,  443  f. 

CP806CX3,  442-443 
CP806CX4,  442-443 
Number-series  tests,  95-100 
evaluation,  97,  99  f. 
validity,  96,  98 

Number  Filing,  CP604B,  392  f. 

Number  Reading,  CP6043,  392  f. 
Number  Series,  CI215AX1,  95-97 
Number  Size,  CP605B,  393  f. 
Numerical: 

factor,  (See  Factor) 
tests,  recommended,  871 
Numerical Approximation, CI706A, 84-
86, 94

Numerical approximation tests, (See
Computation tests)

Numerical Operations:

CI701A, 552
CI701B,  80-84 

Numerical  operations  tests,  (See  Com¬ 
putation  tests) 

Numerical  Sequence,  CI217A,  98  f. 

O 

Object  Completion,  CP811A,  425-428 
Object  Identification,  CP521A,  26,  499- 
501 

Object  Recognition,  CP523A,  501-505 
Objectivity  of  Perception: 

CP806BX1,  443  f. 

CP806CX1, 2, 439-443
Observation  during  Rest,  CE709A,  657- 
660 

Observational  Stress  Technique,  CE710A, 
660-663 

Observational  techniques,  573,  652-669 
evaluation, 656, 659 f., 662 f., 664 f.,
667,  668,  862  f. 

validity,  651  f.,  659,  662,  664,  666 
668 

Occupational  Experience  Blank,  CE603A, 
791  f. 

Officer  candidate,  92,  402,  484,  485 
Organizational  Planning : 

CI407AX,  169  f. 

CI407BX,  167-170 
Orientation : 

in  air  crew,  511  f, 
tests,  512-539 
evaluation,  860 
types,  512 

P 

Paired-associates  memory  factor,  ( See 
Factor) 

Paper : 

folding  test,  277-279 
form-board  test,  336,  421 
Paranoid  traits,  592 
Paratroop  Dropping,  CI209A,  403  f. 
Path  Length,  CP628,  461-463 
Path  Tracing,  CP512A,  382-384 
Pattern : 

analysis  tests,  428-438 
evaluation,  430,  434,  436,  438,  445 
validity,  429,  434,  435,  437 


completion  tests,  424-428 
evaluation,  425,  428.  444  f. 
validity,  425,  428 
formation  tests,  419  424 
evaluation.  420  f..  123,  124,  445 
validity,  420,  422 
orientation  tests,  527-539 
evaluation,  528,  531,  532,  536,  538, 
539 

validity,  529  f.,  533  f.,  538 
reasoning  test,  146 
Pattern  Analysis,  CP5I2A,  428  f. 

Pattern  Assembly,  CP804A.  336,  421-423 
Pattern  Comprehension: 

CP803A,  271-274,  336 
CP803AXI,  B,  273 
Pattern Sequence, CI2J7B, 109 f.

Penetration of Camouflage, CP812A, 431-
434 

Perception : 

aircrew  requirements,  371 
factors,  375  f. 

tests,  code-number  system,  371  f. 
Perceptual  speed: 
in  aviation,  376 
factor,  ( See  Factor) 
tests,  375-408 

evaluation, 380, 384, 407 f., 858 f.
recommended,  871 
validity,  379-381,  383 
Personal  Audit,  CE413A,  585-588 
Personal  Data  Form,  CE605A,  792-794 
Personal  inventories,  578-608 

evaluation, 580, 584 f., 588, 589, 592,
595, 598 f., 601, 603, 606 f., 861 f.
validity, 579 f., 582-584, 587, 589, 591,
594, 597 f., 600 f., 603, 605
Personality: 
defined,  565  f. 
inventories,  577-621 
quotient,  885 
structure,  568  f. 

Phenotypic  descriptions,  13  f. 

Phi coefficient, 29 ff., 38
Physical  Principles: 

CI801AX,  332  f. 

CI801BX, 330-333
Physics, CI801A, 332
Picture Integration, CP101A, 419-421
Picture Evaluation Test, CE712A, 619 f.,
Picture  Exercises  Test,  634-637 
Picture  Judgment  Test,  CE651  f. 

Picture Sequence Test, CE713A, 650 f.
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Pilot : 

B-17,  60,  75,  306,  325,  357,  379,  400, 

533 

B-24,  60,  75,  82,  306,  325,  352,  355,
357,  379,  400,  533

B-25,  60,  75,  306,  325,  352,  355,  357,
379,  400,  533

B-26,  60,  75,  82,  306,  325,  352,  355,
357,  379,  400,  533 
biographical  background,  768  f. 
bomber  vs  fighter,  10,  75,  82,  306, 

325,  379,  381,  400,  402,  529  f.,  533  f.,
731  f.,  743-745,  750  f.,  752  ff.,  766,
790 

criterion,  16 

factor  composition,  839-845
intelligence  and,  56  f.
interest  factor,  (See  Factor)
job  requirements,  7-11,  73,  299
judgment,  123  f.
landing  study,  447  f. 

75,  82,  306,  325,  379,  400,  533
single-engine,  60,  75,  381,  402,  530  f.,
534,  (See  also  Pilot:  bomber  vs
fighter)

specialization,  743-745,  750  f.,  752  ff.,
790,  (See  also  Pilot:  bomber  vs 
fighter) 

twin-engine,  (See  Pilot,  single-en- 
gine) 

validity,  prediction  of,  846-849
Pilot  Behavior  Blank,  CE444A,  716-718 
Pilot-interest  tests,  recommended,  871 
Plane  Formation,  CP805B,  411 
Plane  Formation  Memory,  CI513A,  266 
Plane  Name  Memory,  CI506AX2,  250- 
253 

Plane  Position  Memory,  CI512A,  241— 
243,  266 
Planning: 

defined,  47 
factor,  (See  Factor) 
pathway,  159 

tests,  159-180,  (See  also  Foresight)
evaluation,  162,  164,  167,  169,  173  f.,
177,  180  f.,  190
recommended,  871

validity,  161  f.,  164,  166,  169,  173,

177,  180

Planning-by-deduction  tests,  177-180 
Planning  Air  Maneuvers: 

CI406AX1,  172-174
CI408AX2,  172-174
CI408AX3,  172-174


Planning  a  Circuit,  CI401A,  162-164 
Planning  a  Course,  CI406AX3,  192-196 
Planning  by  Deduction,  CI409AX1,  180 
Planning  Maze  Test,  CI405A,  159,  165 
Plotting  Accuracy  Test,  CE453A,  686
Plotting  Test,  CE452A,  684-686 
Point  biserial  r,  30  f.,  35
Position  Orientation,  CP526A,  505-509 
Position  Visualization,  CP534A,  288-290 
Position  Visualization  II,  CP111A,  290- 
291 

Positional-discrimination  tests,  498-509 
evaluation,  501,  505,  508  f.
validity,  501  f.,  504,  508
Power  tests,  881  f. 

Practical-estimations  tests,  131-135 
evaluation,  133,  135,  154 
validity,  133,  134  f.
Practical  Estimations:

CI308AX1,  132  f.

CI308BX1,  133-135 

Practical-judgment  tests,  (See  Judgment 
tests) 

Practical  Judgment: 

CI301BX1,  125-127 
CI301BX2,  129-131
CI301C,  131 
CI301DX1,  131 

Precision,  psychomotor,  factor,  (See
Factor)

Preference: 

blanks,  723-736 
and  graduation  rate,  726
inventories,  608-619
evaluation,  611,  613,  616,  619,
620  f.

validity,  610  f.,  6a,  615,  617  f.
ratings 

evaluation,  865 
validity,  725  f.,  734
statement,  721  f.
waiver,  722,  724,  725,  732
validity,  727  f.

Pressure,  work  under,  552  f.,  554
Primary  school,  defined,  2 
Projective: 

methods,  625-652 

evaluation,  633  f.,  637,  642,  645  f.,
649,  862

validity,  631-633,  636  f.,  640  f.,
644,  648

tests,  573,  (See  also  Rorschach,  and
Thematic  Apperception) 
Psychomotor:
coordination,  factor,  (See  Factor)
factor  I,  II,  or  III,  (See  Factor)
precision,  factor,  (See  Factor)
speed,  factor,  (See  Factor)
tests,  recommended,  871 
Psychomotor  Instruction  Comprehension 
test,  CI626B,  66-69 
Pursuit  Test,  CP414A,  382-384 

Q 

Qualifying  Examination: 

AAF,  51,  71,  124,  143,  164,  284,  350,
430

Aviation  Cadet,  46 

Quantitative  Estimation,  CE410A,  708-711

Quantitative  perception  tests,  384  ff.,

448  ff. 

R

Radio  operator,  60,  78,  83,  308,  326,  352,
355,  357,  378,  381,  402,  529  f.,  533  f.
Rapid  Projection  Test: 

CE711A,  643 
CE711B,  643 
CE711C,  642-646 
Ratings: 

of  appearance,  667  f. 
of  behavior,  662,  666 
evaluation,  862  f. 

intercorrelations,  588,  641,  5!>5,  659, 

664 

interests,  722  ff. 
preference,  865 
validity,  725  f.,  734 
self,  of  performance,  703  f.
validity,  584,  641 
Rationalization,  738 
Reaction-formation,  738 
Reaction  Speed: 

CE451AX1,  677  f.

CE451AX2,  679  f.
Reading-comprehension  tests,  56-69
evaluation,  58,  61,  65,  69,  552
validity,  52,  58,  60,  61,  65
Reading  Comprehension : 

CI606A  (Training  and  Duties),  56-58

CI614G,  58-62

CI614H,  62-66 
Realign,  hypothesis,  703 
Reasoning : 

deductive,  122,  144 
inductive,  122 


in  judgment  items,  129 
I,  II,  or  III,  (See  Factor)
in  reading,  145 
syllogisms,  146 
tests,  89-122 
evaluation,  122,  853  f. 
recommended,  871  f. 
validity,  153 

Reasoning  Test,  CI215A,  97  f.

Reliability,  26-28 

alternate  forms,  25  f. 

equation  for,  877 

and  factor  variance,  822,  877-879 

maximum,  20 

odd-even,  27 

time  interval  and,  26  f. 

uses,  27  f. 

and  validity,  878  f.

Restricted  Word  Association  Test, 
CE702B,  607  f. 

Reversed  Clock  Test,  559  f. 

Rorschach  Test,  CE701A,  625-634 
evaluation,  862 
examiner  differences,  630  f. 
group  administration,  634-637 
time  of  day  and,  631 

Rotary  Pursuit,  CM803A,  813 
diagram,  67 

Route  Planning,  CI411AX,  159-162

Rudder  Control  Test,  CM120B,  813 

S 

Satisfaction  Test: 

CE409A,  736-740 
CE409B,  C,  740-745
CE409D,  745  f. 

Scores,  rights  and  wrongs,  20,  135,  212,
275,  279,  281,  293,  427  f.,  433,  435,  437,
442,  458,  464  f.,  470,  488,  500  f.,  502,
504,  508,  513  f.,  521,  694,  882  f.

Scoring : 

formulas,  458-460 
factors  and,  19  f.,  460  f. 
optimal,  59,  81,  304  f.,  883  f.
and  reliability,  470
study,  459-461

Seating,  effects  on  scores,  438  f.

Self-confidence,  578  ff.

Self-Crediting  Mental  Abilities,  CE4?J.\, 
706-708 

Self-sufficiency,  578  ff.

Sequence  of  Maneuvers,  CI410A,  174- 
177 

Set,  change  of,  556  ff. 


Shipley  Personal  Inventory,  CE601B,  572,
601-607

Shorter  Line,  CP606,  448-451 
Shortest  Path,  CP608,  452-456 
Shrinkage,  in  multiple  correlation,  767 
Signal  Interpretation,  CI656A,  199-202
Similarities  Test,  CI319A,  137  f.
Size-estimation  tests,  447-474 
evaluation,  474  f.,  859 
Skill,  knowledge  and,  341  f. 

Sociability,  578  ff.

Social : 

intelligence  tests,  713-719 
evaluation,  715  f.,  718,  865

science  background  factor,  (See 
Factor) 
sensitivity,  617 

Social  Concepts,  CE512A,  755-757 
Social  Manipulation  Inventory,  CE443A, 
713-716 

Social  Understanding,  test,  756 
Space  factor: 

I,  II,  or  III,  (See  Factor)
theory,  479,  499 
Spatial : 

ability,  concept,  477 
reasoning  tests 
evaluation,  112,  113 
validity,  112,  113 
relations 

factor,  (See  Factor)
theory,  283 
tests,  479-509 
evaluation,  859  f. 
recommended,  872 
Spatial  Orientation: 

CP501A,  2A,  3A,  532 
CP501B,  30,  527-533
Spatial  Reasoning: 

CI211BX1,  110-113
CI211BX2,  112  f.

Spatial  Visualization  I,  CI204AX1,  277- 
279 

Spatial  Visualization  II,  CI203A,  274-277

Spatial  Visualization  III,  CP535,  287  f.
Spearman-Brown  formula,  27
Specialization  Interest  Inventory,

CE609A,  755

Specialization  Preference  Inventory,

CE610A,  752-755
Speed : 

of  decision,  866 

factor  composition  and,  460  f.


psychomotor,  (See  Factor)
tests,  460  f.,  881  f.

Speed  of  Identification,  CP610A,  376- 
382 

Speed  Estimation  II,  CP205B,  408  f.

Sports,  and  pilot  selection,  349  f.

Sports  and  Hobbies  Check  List,
CE506X,  346-348
Sports  and  Hobbies  Participation: 
CE506D,  343-350 
CE506E,  343-350 
Sports-and-hobbies  tests,  342-350 
evaluation,  348  f. 
validity,  345 

Stanine:

augmented,  37 
defined,  35 

factor  analysis  of,  819 

intercorrelations,  876

validity,  and  interest,  729 
Star  Identification,  CP519B,  536-539 
Statistical : 

procedures,  25-42 
rationale,  887  ff. 
symbols,  42  f.,  887 

Stick  and  Rudder  Orientation,  CP531A, 
492  f. 

Street  Gestalt  Completion  test,  426 
Strength-of-interest  scale,  722
Stress  Resolution,  CE441A,  700-702 
Strong  Vocational  Interest  Blank  for 
Men,  CE503A,  608-611 
Structural  Answer  Projection  Test,
CE714A,  651

Student-instructor  matching,  619 
Surface-development  test,  271 
Survey  of  Aviator  Opinion: 

CE604A,  699  f.,  776 
CE604B,  695-699 
CE604C,  700 

Survey  of  Personal  Attitudes,  CE508B,
757-759

Sustained-attention  tests,  542-563
evaluation,  547,  548  f.,  563  f.
validity,  546,  548

Syllogism,  (See  Reasoning)


Table  Reading: 

CP621A,  396  f.

CP60J13,  388-392 

Teacher  Preference  Scale,  CE426A,  616- 
619 

score  intercorrelation,  618
ability,  aircrew  requirements,  52  f.
factor,  (See  Factor)
tests,  51-70,  (See  also  Reading  Comprehension,
and  Vocabulary)
evaluation,  852 
recommended,  873 
Verbal  Recognition,  CI322B,  137 
Visual: 

completion  tests,  292-295 
evaluation,  294  f.
validity,  294 

manipulation  tests,  271-292
evaluation,  274,  277,  279  f.,  283  f.,
285,  287,  291  f.,  295  f.
validity,  273,  276,  2»,  282
memory  factor,  (See  Factor) 

Visual  Memory,  CI514A,  240  f.
Visualization:

in  aviation,  270  f.
factor,  (See  Factor)
history,  269  f.
tests,  4«,  271-295
evaluation,  856  f. 
recommended,  873 
theory,  283,  295  f. 

Visualization  of  Maneuvers:

CI657AX1,  280-284
CI657C,  30

Vwtoalization^lafciptc  Choice.  CE701B, 
634-637 

Vocabulary  Pressure  Test,  CE2D1A, 
553  f. 

Vocabulary  Test: 

AAF,  CI604B,  55-56
CI604A,  53-55 
CI605A,  53-55 
Cooperative  Form'll,  53-55
Vocabulary  tests,  53-56

evaluation,  54,  56,  69,  852 
validity,  52,  54,  56,  153
Vocational:

guidance,  problems,  884  f.
interest  tests,  608-616
selection,  problems,  884  f.

W 

West  Point  Cadets,  64,  81,  91,  37*,  783
Wiggly  Blocks  test,  665
Women's  Army  Service  Pilots  (WASP),
92,  312,  361,  400,  484  f.,  529,  533,

604-607,  784-787

Word  Association  Test,  CI318A,  138  f.
Work-in-Flight  Test,  CE415A,  551-553